ALBERT I. RABIN
Michigan State University

AND CLAYTON E. LADD
Indiana University

Two important events took place
since the bulk of the material for our
previous review (Guertin, Frank, &
Rabin, 1956) was gathered and organized.
The first was the publication
of the manual for the revised WB,²
known as the WAIS (Wechsler,
1955); the second was the appearance
of a new, rewritten, and reorganized
edition of Wechsler's Adult
Intelligence (1958).

Although the manual was mentioned
in our previous review, the new
test it introduced, the WAIS, had
not yet become the popular instrument
it is today. It seems to be replacing
the old WB as a research tool
and as a clinical and assessment
device for many good reasons.
For reviews of the WAIS see Buros
(1959). The present review covers
work done with both instruments, for
it spans a period of transition.

1 Through July 1960.

2 The abbreviation, WB, will be used
throughout to indicate the Wechsler-Bellevue
Intelligence Scale, Form I. Form II will be
designated WB II, while WAIS signifies the
Wechsler Adult Intelligence Scale. The names
of the subtests also appear in abbreviated
form throughout the paper. The single letters
I, C, A, D, S, and V stand for the Verbal
subtests: Information, Comprehension,
Arithmetic, Digits, Similarities, and Vocabulary,
respectively. The two-letter combinations
PA, PC, OA, BD, and DS correspond to
the following Performance subtests: Picture
Arrangement, Picture Completion, Object Assembly,
Block Designs, and Digit Symbol,
respectively. FS, VIQ, and PIQ stand for Full
Scale, Verbal IQ, and Performance IQ, respectively.

In the closing summary of the 
previous review, we expressed the 
hope for "the creation of a newly 
standardized instrument, similar in 
structure to the WB, but not suffer- 
ing from the numerous weaknesses." 
The WAIS, in many respects, is the 
answer to this wish. A fairly rich 
harvest of research with this method 
is critically considered in the follow- 
ing pages. It may be added, in agree- 
ment with Wittenborn (1957), that: 
There is a refreshing trend away from gross 
empirical validations which required that 
tests predict the diagnostic decisions of psy- 
chiatrists or psychologists. Instead, there 
seems to be an emphasis on the conceptual 
validity of the procedures employed in assess- 
ment (p. 331). 


The general outline of the present 
review and its organization are quite 
similar to our previous reviews 
(Guertin et al., 1956; Rabin, 1945; 
Rabin & Guertin, 1951). The 
amount of material covered under 
each rubric differs, however, for some
currents have run dry, while previous
trickles have expanded markedly.
The bibliographical coverage is selec- 
tive in view of differences in rele- 
vance, quality, and significance of 
the various researches reported in 
the literature.³


AS A MEASURE OF INTELLIGENCE
Reliability 

An inspection of Wechsler’s tables 
(1958, pp. 102-3) suggests that the 
WAIS IQs and verbal subtests are 
slightly more reliable than compa- 
rable WB IQs and subtests, but that 
the performance subtests (possibly 
excepting DS) have about the same 
reliability coefficients on both tests. 
Perhaps this indication of increased 
reliability with the WAIS has cur- 
tailed the number of studies report- 
ing test-retest or split-half reliabilities 
for this test as only one has been pub- 
lished thus far. Over long periods 
ranging from 1 to 5 years and using 
bright “normals” (Bayley, 1957) or 
psychiatric patients (Armitage & 
Pearl, 1958), the WB has yielded 
test-retest correlations similar to 
those found in earlier reliability 
studies, i.e., .77 to .95.

3 A supplementary bibliography along with
the references covered by this review aims at
complete coverage of research articles employing
the adult Wechsler scales. This supplementary
bibliography has been deposited
with the American Documentation Institute.
Order Document No. 6843 from ADI
Auxiliary Publications Project, Photoduplication
Service, Library of Congress, Washington
25, D. C., remitting in advance $1.25
for microfilm or $1.25 for photocopies. Make
checks payable to: Chief, Photoduplication
Service, Library of Congress.

Coons and Peacock (1959), using
24 mental hospital patients, obtained
test-retest correlations for all three
WAIS IQ scores of .96 or better, and
the standard errors of measurement
were consistent with those obtained
with the standardization sample.
From this it was inferred that:

IQ changes on retest with different examiners 
of more than 6 points can be attributed with 
reasonable confidence to changes in the mental 
state of the patient. 


Yet, the practice effects or at least 
increments in IQ scores at the time of 
the second testing were 2.6, 8.6, 
and 5.0 points for VIQ, PIQ, and 
FSIQ, respectively. Consequently,
the quoted inference needs a qualification,
such as "after appropriately
adjusting for practice effects."
Test-retest differences were not only
greater but also more variable for the
PIQ than for the VIQ or FSIQ;
thus, it was concluded that "the
Verbal scale is a better indicator of
the level of the original Full Scale
performance than is the Performance
Scale IQ." At the subtest level, the
test-retest reliabilities are generally 
higher than the split-half reliabilities 
reported in the WAIS manual (1955). 
D had the lowest reliability of all 
subtests with a .84; the other Verbal 
subtests (excepting C with a .89) 
were .94 or better. The Performance 
subtests averaged .88, suggesting to 
the authors that the Verbal subtests 
are more reliable than the Perform- 
ance subtests; however, one should 
remember that the practice effects 
were much more variable on the 
Performance subtests, which would 
reduce the test-retest reliability co- 
efficients. 
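
The qualification about practice effects can be made concrete with a small computation. What follows is only a minimal sketch and not Coons and Peacock's own procedure: it assumes the conventional Wechsler IQ standard deviation of 15, takes the retest correlation of .96 and the mean retest gains (2.6, 8.6, and 5.0) from the figures quoted above, and uses hypothetical function names throughout.

```python
import math

# Mean retest gains reported above for Coons and Peacock's 24 patients.
PRACTICE_GAIN = {"VIQ": 2.6, "PIQ": 8.6, "FSIQ": 5.0}


def standard_error_of_measurement(sd, reliability):
    """Classical SEM = SD * sqrt(1 - r)."""
    return sd * math.sqrt(1.0 - reliability)


def adjusted_change(first_iq, second_iq, scale):
    """Retest change after subtracting the mean practice gain for the scale."""
    return (second_iq - first_iq) - PRACTICE_GAIN[scale]


def change_is_notable(first_iq, second_iq, scale, criterion=6.0):
    """Apply the quoted 6-point criterion to the practice-adjusted change."""
    return abs(adjusted_change(first_iq, second_iq, scale)) > criterion


if __name__ == "__main__":
    # Assumed SD of 15 (illustrative) with the reported r of .96.
    print(round(standard_error_of_measurement(15.0, 0.96), 2))
    # A raw PIQ gain of 8 points lies within the average practice effect.
    print(change_is_notable(100, 108, "PIQ"))
```

On these assumptions a raw 8-point gain on the PIQ is fully accounted for by the average practice effect, so it would not by itself indicate a change in the patient's mental state.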


Comparative Validity 


WB II and WISC. Earlier com- 
parisons of the WB and WB II dis- 
closed that practice effect was ap- 
preciably greater when the WB II 
was administered first. Thus, a very 
interesting and mystifying phenom- 
enon confronted and worried Wechs- 
ler workers until Barry, Fulkerson, 
Kubla, and Seaquist (1956) failed to 
find a significant interaction between 
practice effect and the form of the
WB administered first. Furthermore, 
they reported lack of equivalence 
between forms for entirely different 
subtests (S, OA, and DS) than re- 
ported by earlier workers. Their 
equivalent form reliability coeffi- 
cient of .71 is consistent with earlier 
findings and is rather high since their 
range of talent (intelligence) was 
only half that of an unrestricted 
sample. 

Findings of earlier comparisons 
between the WISC and WB were 
confirmed by Price and Thorne 
(1955). Their sophisticated statisti- 
cal analysis of data disclosed a 
slightly lower WB FSIQ and VIQ, 
while PIQ was slightly higher than 
for corresponding WISC scales. Cor- 
relation between the FSIQs was very 
high for their 11.5-year-old sample 
(.89) and moderate for their 14.5- 
year-old sample (.78), but range of 
talent was considerably lower in the 
older group. 

WAIS. Cole and Webela (1956) 
reported a comparison of the WB 
and WAIS, but their restricted range 
of talent and incomplete counter- 
balancing of the form of the test with 
order of administration prevent any 
findings from being more than suggestive.
Goolishian and Ramsay
(1956) also were interested in the 
equivalence of the new WAIS and 
the WB, so they studied the two 
arrays of test scores in their hospital 
files. While the design employs 
different subjects for the two test 
scores, thus permitting the operation 
of sampling biases, the investigators 
employed a large N. They failed to 
find the extreme differences noted by 
Cole and Webela, but five subtests 
showed significant differences be- 
tween the two tests.  Neuringer's 
careful study (1956) showed FSIQ 
and PIQ were higher for the WB, a 
finding echoed in a more subjective 


report by Sinnett and Mayman 
(1960). Dana's results (1957a), based 
upon a study of only the Verbal 
scales, revealed no significant differ- 
ences for any of the subtest com- 
parisons, a finding that is quite 
different from that of Cole and 
Webela. In support of the large
differences between forms found
by Cole and Webela, however, Karson, Pool,
and Freud (1957) reported significant
differences for five subtests, also
providing confirmation of some of the
Goolishian-Ramsay findings. Light
and Chambers (1958) found, with
defectives, that the WAIS VIQ and
FSIQ were significantly higher than
for the WB. Correlation of the FSIQ
was .77 for their restricted range of 
talent sample. Garfield (1960) found 
BD to be ninth in WAIS subtest 
order of difficulty as compared with 
third place for the WB BD. 

It would appear that the only con- 
sistent finding with samples of aver- 
age or higher intelligence is higher 
scores on BD, DS, PIQ, and FSIQ 
for the WB; and there is little agree- 
ment as to which of the verbal sub- 
tests are lower for the WB, if any. 
Only Neuringer's study (1956) had 
all the necessary features of ap- 
propriate range of talent, sufficient 
N, unbiased samples, and appropriate 
counterbalancing to test the equiva- 
lence of the WB and WAIS. After 
correcting for range of talent, Neu- 
ringer's correlations for VIQ, PIQ, 
and FSIQ, respectively, were .89, 
.44, and .77, hardly satisfactory for
"equivalent form" reliability.
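
Because correction for range of talent recurs throughout this section, a brief illustration may help. The sketch below uses the common textbook formula for direct restriction on one variable; the source does not say which formula Neuringer applied, so both the formula choice and the function name are assumptions, and the numbers in the example are hypothetical.

```python
import math


def correct_for_range_restriction(r_restricted, sd_unrestricted, sd_restricted):
    """Estimate the correlation in an unrestricted population.

    Textbook formula for direct selection on one variable:
    r_c = r*k / sqrt(1 - r**2 + (r*k)**2), with k = SD_unrestricted / SD_restricted.
    """
    k = sd_unrestricted / sd_restricted
    rk = r_restricted * k
    return rk / math.sqrt(1.0 - r_restricted ** 2 + rk ** 2)


if __name__ == "__main__":
    # Hypothetical sample whose IQ spread is two thirds of the population spread.
    print(round(correct_for_range_restriction(0.60, 15.0, 10.0), 3))
```

When the two standard deviations are equal the function returns the obtained coefficient unchanged; the narrower the sample, the larger the upward correction.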

Other tests. Sines (1958) reported 
correlations of .77, .78, and .79 be- 
tween the Shipley-Hartford and the 
WB FS scores for three samples and 
provides regression equations for 
predicting WB FSIQ from the Shipley. 
Three tests from the Army Classifica- 
tion Battery correlated .60 to .81 
with the WB FS scores (Montague, 
Williams, Lubin, & Geiseking, 1957),
while Murphy and Langston (1956) 
obtained a .83 between the WB FS 
score and the Army Classification 
Battery, Area Aptitude I Test. 
Higher correlations between the Re- 
vised Beta and WAIS (.81 and .83 
for Negro and white prisoners) were 
found by Panton (1960). 

Sterne (1960) reported a correla- 
tion of .84 between the Ammons Full 
Range Picture Vocabulary Test 
(FRPV) and the WAIS FSIQ for a 
sample of older medical patients. 
Allen, Thornton, and Stenger (1956), 
using college students with a mark- 
edly restricted range of talent, ob- 
tained a correlation of only .46 be- 
tween the FRPV and the WB FSIQ. 
Fisher, Shotwell, and York (1960) 
found correlations between FRPV 
and various WAIS scores ranging 
from .36 to .79 with defectives. 
Borgatta and Corsini (1960) reported 
correlations between WAIS FS scores 
and four forms of their Quick Word
Test of .75 to .83, with the observa- 
tion that coefficients are attenuated 
by reduced range of talent. Rabin- 
owitz (1956) compared the Kent 
EGY with the WB FSIQ and found a 
correlation of .69 for hospitalized 
psychiatric patients with a normal 
range of intelligence. 

Those interested in Raven’s Pro- 
gressive Matrices often use the 
Wechsler for comparative purposes. 
Hall (1957a) found a .72 correlation 
with the WAIS FS scores, while 
Stacey and Gill (1955), working with 
the restricted range of talent found in 
samples of adult defectives, reported 
a correlation of .68 with the WB 
FSIQ. Urmer, Morris, and Wendland
(1960), and Moya-Diaz and
Matte-Blanco (1953-55) also studied 
the matrices and Wechsler scores. 
The latter found the tests fairly 
equivalent but noted that anxiety 
and cultural factors were more im- 


portant determinants of WB scores 
than for scores on the matrices. 
Confirming this, Levinson (1959) 
employed a sample of 80% foreign 
born with two age ranges. Matrices 
scores correlated with the WAIS 
FSIQ .65 for his 60-69 year olds and 
.40 for his 70-79 year olds. As ex- 
pected, he found a negative correla- 
tion between WAIS performance and 
age, which was greater in the older 
group. Had he used WAIS weighted 
scores instead of IQ, he would have 
obtained higher and more appropri- 
ate correlations with the matrices. 

Hall (1957b) found the WB FS 
scores and Wechsler Memory Scale 
correlated .75 and concluded there 
was a large overlap in what the two 
tests measure. Strong (1959) found a 
mixture of WAIS and WB FSIQs 
correlated .63 with the Ohio Literacy 
Test for psychiatric patients. One 
would expect a higher correlation for 
weighted score than IQ since the Ohio 
Literacy Test has no correction for 
deterioration with age. 

Summary. The studies reviewed 
in this section, when compared as a 
whole with those covered in the last 
review, are very disappointing. Not 
only have the investigators failed to 
learn from others’ mistakes, but 
there seems to be little tendency to 
design critical and conclusive studies 
to resolve conflicting findings re- 
ported earlier. 

Range of intelligence in the sample 
is often ignored, frequently not re- 
ported, and only one correlational 
study employed a correction for re- 
stricted range of talent (intelligence). 

It seems useless to remind investi- 
gators that equivalence between tests 
depends upon both correlation and 
differences in mean scores, but we 
would be remiss were we not to repeat 
this again. Somewhat encouraging is 
the tendency seen to use the more 
sophisticated approaches of analysis 
of variance and regression equations
for specifying IQ. 


Short Forms 


The new WAIS has given a fresh 
impetus to studies involving short 
forms. In an early article concerning 
the WAIS, Doppelt (1956) decided 
upon the tetrad short form (A, V, 
BD, and PA) consisting of the two
subtests from each scale which correlated most highly
with their respective scale scores in 
Wechsler's standardization popula- 
tion. Doppelt presented a regression 
equation method of computing the 
FS score which was compared by 
Himelstein (1957b) with simple pro- 
rating. Himelstein found the total 
scores computed by the two methods 
correlated .99 and since the means 
were identical, concluded that the 
clinician may feel free to use either 
method. 

The Doppelt article was the partial 
stimulus for a rash of studies (Clay- 
ton & Payne, 1959; Fisher & Shotwell, 
1959; Himelstein, 1957b, 1957c; Olin 
& Reznikoff, 1957; Sines & Simmons, 
1959; Sterne, 1957; Whitmyre & 
Pishkin, 1958) reporting the applica- 
tion of Doppelt's WAIS short form to 
patient populations and generally 
concluding that this abbreviated 
scale provided about as valid an esti- 
mate of the FS score for heterogene- 
ous psychiatric subjects as for the 
standardization subjects. While cor- 
relations range from .92 to .97, it 
must be remembered that they are 
exaggerated since they represent 
correlation of parts with the whole. 
Findings for samples with restricted 
range of talent gave lower short 
form-FS correlations for homeless 
men (Levinson, 1957), mental defec- 
tives (Clayton & Payne, 1959), and 
students (Allen et al., 1956). Both 
Levinson's and Himelstein's comments
(1957a) ignore the constricting
effect of the reduced range of talent 


in Levinson's sample on the size of 
the obtained correlation, which, when 
corrected, rises from .87 to .92. 
Sterne (1957) similarly found a lower 
correlation with organics but the ob- 
tained coefficient is highly unreliable 
with N = 12.
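
The inflation of part-whole correlations noted above can be illustrated by removing the short form from the total before correlating. The following is a sketch of the standard part-remainder formula, offered only as an illustration; none of the studies cited is known to have applied it, the function name is hypothetical, and the example values are invented.

```python
import math


def part_remainder_correlation(r_part_whole, sd_whole, sd_part):
    """Correlation of a part with the remainder of the whole (whole minus part).

    r = (r_pw * s_w - s_p) / sqrt(s_w**2 + s_p**2 - 2 * r_pw * s_w * s_p)
    """
    numerator = r_part_whole * sd_whole - sd_part
    denominator = math.sqrt(
        sd_whole ** 2 + sd_part ** 2 - 2.0 * r_part_whole * sd_whole * sd_part
    )
    return numerator / denominator


if __name__ == "__main__":
    # Invented values: short form SD of 10, Full Scale SD of 25, obtained r of .95.
    print(round(part_remainder_correlation(0.95, 25.0, 10.0), 3))
```

The result is always lower than the part-whole coefficient, which is the sense in which the reported values of .92 to .97 overstate the agreement between the short form and the subtests it omits.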

Using a similar formula to that 
developed by McNemar for the WB, 
Maxwell (1957) determined the cor- 
relation of all possible two, three, 
four, and five subtest combinations 
with the WAIS FS for the 300 sub- 
jects in the 25-34 age group of the 
standardization population. She 
concluded: (a) that the accuracy of 
abbreviated scales is a function of 
the number of subtests included; (b) 
that while short verbal scales are 
generally better than performance 
scales as predictors of FS scores, a 
combination of both verbal and 
performance subtests is best; (c) that 
the best WB and WAIS abbreviated 
scales are not composed of the same 
subtests; and (d) that WAIS short 
forms are more highly correlated with 
the FS than are the WB short forms. 
The last conclusion was challenged 
by Howard (1958) who contends 
McNemar made an error and under- 
estimated the correlations between 
WB abbreviated scales and FS. 
Howard (1959) also reported finding 
higher WAIS short form-FS correla- 
tions in a group of heterogeneous 
psychiatric patients than Maxwell 
found in the standardization sample, 
but he recognized that “the differ- 
ences appeared to result from the 
greater variance of the patient sam- 
ple." 

Three studies within the last 5 
years have considered the usefulness 
of WB II abbreviated scales for 
employee selection (Sloan & Newman,
1955), with alcoholic outpatients
(Schneyer, 1957), and with psychotics
and students (Caldwell & Davis,
1956).


Special Populations and Applications


Intelligence as a function of age. 
Bayley (1957) concerned herself with 
the growth of intelligence between 16 
and 21 years of age in an extension 
of the now famous Berkeley Growth 
Study. In general, subjects improved 
with each testing regardless of intel- 
lectual or educational level. Certain 
individuals, however, appeared to 
have reached their asymptote by 16 
or 18 while others continued to de- 
velop until 21 or older. Although 
acknowledging possible practice ef- 
fects, Bayley did not feel this totally 
accounted for the increments in 
performance. 

Concerned with the encroachments 
of old age in a randomly selected 
probability sample in Delaware, 
Whiteman and Jastak (1957) admin- 
istered three subtests of the WB to 
1,980 persons and found little decline 
with age on C, moderate decline on 
PC, and marked decline on DS begin- 
ning at age 35. These differential 
deficits in performance accruing with 
age were interpreted as "a decline in
certain group and specific factors— 
conative, perceptual, and motoric in 
nature—rather than as a decline in 
general intellectual ability per se." 
Similar interpretations of the WAIS 
standardization data were made by 
Doppelt and Wallace (1955) and 
Wechsler (1958). Comparing the WB 
standardization population with the 
WAIS standardization population, 
Wechsler (1958) noted that the best 
overall WAIS test scores occurred in 
the 25-29 age interval rather than 
the 20-24 age interval found for the
WB standardization. Also, the gen- 
eral rate of decline was said to be less 
for the WAIS than for the WB up to 
age 50. 

Doppelt and Wallace (1955) found 
that allowing the elderly subjects un- 
limited time made very little differ- 
ence in their scores. The WAIS 


standardization population scores be- 
gan to decline with aging much sooner, 
and decrement was much more 
marked on the Performance subtests 
than on the Verbal ones. The WAIS 
Verbal subtests hold up fairly well 
until about 70 years of age at which 
time all subtest performances decline 
rapidly with age. Eisdorfer, Busse,
and Cohen (1959) questioned the
representativeness of the WAIS Kansas
City aged sample (Doppelt &
Wallace, 1955) when 162
volunteer subjects from the Piedmont
section of North Carolina consistently
(82%) manifested a superiority of
VIQ over PIQ. This Verbal superiority
remained even when sex, race,
socioeconomic, intelligence, and 
mental health differences were ana- 
lyzed separately. It is noteworthy 
that the VIQ-PIQ discrepancy for 
the entire sample is more attributable 
to an elevation of the VIQ (106.5) 
above the norm than to a depression 
of the PIQ (98.5). It may be that 
their volunteers show a greater rela- 
tive elevation of verbal skills than 
the WAIS standardization sample. 
Loranger and Misiak (1960) found 
DS performance of a group of aged 
females comparable to that of the 
Kansas City standardization sample. 

Sex differences. In the WAIS 
standardization population there 
were consistent but negligible differences
in Verbal, Performance, and FS
scores in favor of the males (Doppelt
& Wallace, 1955; Wechsler, 1958).
Eight of the 11 subtests showed significant
sex differences, with men
doing better on five (I, C, A, PC, and
BD) and women better on three (S,
V, and DS). Apparently the rise and
fall of the Mental Deterioration
Index has had little effect on Wechsler's
habit hierarchy, for he now
proposes a new "WAIS masculinity-femininity
(MF) score" composed of
the F total (V+S+DS) subtracted
from the M total (I+A+PC). In
the Plant and Lynd (1959) norms for 
361 college freshmen there were no 
statistically significant sex differ- 
ences on any of the WAIS IQs but 
subtest scores were not reported. In 
the Berkeley Growth Study (Bayley, 
1957), males were superior on the 
Verbal scale, while females were 
higher on the Performance scale; 
however, there was no evidence for 
an earlier intellectual maturation of 
females. An unpublished thesis by 
Miele (1958) deals with sex differ- 
ences on the WAIS. 
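
The arithmetic of the proposed MF score, as described in this section, can be written out in a few lines. This is only a sketch of the subtraction stated above (M total minus F total); the function name and the example subtest scores are hypothetical, and no interpretive cutoffs are implied.

```python
# Subtests entering each total, as given in the text.
M_SUBTESTS = ("I", "A", "PC")
F_SUBTESTS = ("V", "S", "DS")


def mf_score(scaled_scores):
    """Return the M total (I+A+PC) minus the F total (V+S+DS)."""
    m_total = sum(scaled_scores[name] for name in M_SUBTESTS)
    f_total = sum(scaled_scores[name] for name in F_SUBTESTS)
    return m_total - f_total


if __name__ == "__main__":
    # Hypothetical subtest scores for one examinee.
    example = {"I": 12, "A": 11, "PC": 10, "V": 13, "S": 12, "DS": 11}
    print(mf_score(example))
```

A positive value means the M total exceeds the F total, and a negative value the reverse.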

Educational and vocational applica- 
tions. The general intellectual level 
of college students has long been of 
interest. Plant and Richardson 
(1958) recently reported a mean WB 
FSIQ of 116.5 for college freshmen 
volunteers. Wechsler (1958) reported 
a very similar mean. Plant and Lynd 
(1959) found correlations of Verbal, 
Performance, and FS WAIS weighted 
scores with grade point average for 
the freshman year were .58, .31, and 
.53, respectively, which were as good
or better than similar correlations 
for the ACE. Their normative data 
reveal an expected restriction in 
range of talent. The WB VIQ for 
engineering students has been re- 
ported (Wechsler, 1958) to be not 
only superior to the PIQ but also 
more highly correlated (.41 vs. .08) 
with college grades. Weisgerber 
(1955) concluded that Diamond's 
factor analytically based scoring 
method designed for vocational coun- 
seling with the WB was not as useful 
as the VIQ for predicting academic
success of engineering students. At 
an even higher educational level, 
Holt and Luborsky (1959) have indi- 
cated their surprise at finding the 
WB VIQ to be one of the better 
predictors of performance in psychi- 
atric residency training in spite of the 
test's ceiling. Correlations between
the WB VIQ and supervisor-peer
ratings on diagnosis, therapy, ad- 
ministration, management, and over- 
all competence ranged from .27 to 
.47; even the correlations with em- 
pathy, interest, sensitivity, firmness, 
etc. were in the .30s. 

A very interesting and thorough 
study of the relationship between 
intelligence (WB) and rated creativity
in 64 chemists engaged in
industrial research has been reported 
by Meer and Stein (1955). Not too 
surprisingly, when the entire group
was considered there were generally
positive, although not always
significant, relationships among education,
intelligence, and creativity.
Their probing analysis, however, led 
to the tentative conclusion that: 
Where equal opportunity is available higher 
IQ scores beyond a certain point [approxi- 
mately Percentile 95] have relatively little 
significance for creative work. 


Considering the role of intelligence 
in managerial positions, Balinsky 
and Shaw (1956) found their unique 
sample had a higher WAIS VIQ (125) 
than PIQ (117) and, after correlating 
the IQs and subtest scores with over- 
all performance ratings by superiors 
and peers, concluded that: 

Apparently verbal intelligence and especially
arithmetical ability are important factors in
the performance of the executive personnel.


While one might argue with the
authors' phraseology ("important
factors"), since the data indicated
only one (A) of the 11 subtests 
yielded a significant correlation, the 
VIQ-performance rating correlation 
of .32 was significant at the .05 level. 
Another study, by Dunnette and 
Kirchner (1958), provides some con- 
firmation of this relationship between 
intelligence and managerial effective- 
ness. 

Cultural influences, translations,
and ethnic groups. Bloom (1959)
recently compared 67 student nurses
in Missouri with 67 in Hawaii using 
the V and PC subtests of the WAIS. 
The Missouri nurses obtained higher 
scores on both subtests (significant 
at .01 level only for V), and seven of 
the eight hypotheses about ecologic 
difficulty of PC items were con- 
firmed. In a similar fashion, Breiger 
(1956) compared the WB PA per- 
formance of 30 United States Cau- 
casians, 20 Nisei, and 10 German 
refugees. The three groups, matched
on IQ, education, urban-rural resi- 
dence, and bilingualism, scored ap- 
proximately the same on this subtest 
when evaluated in the usual manner, 
but a content analysis of stories re- 
lated to their own arrangement of 
the "Flirt" and “Taxi” items re- 
vealed marked differences. Signifi- 
cantly more Caucasians than Nisei 
projected romantic implications into
the Flirt sequence and abnormal sex 
behavior into the Taxi arrangement. 
Sullivan (1957), in testing 15 and 16 
year olds in Newfoundland, found 
rural subjects were handicapped on 
the WB. 

Numerous applications of the WB 
and WAIS to foreign populations are 
evident during this 5-year period, 
and most of these investigators have 
found it necessary to make modifica- 
tions of varying degrees to the test to 
correct for cultural biases. New 
translations of Wechsler’s third edi- 
tion have been made into French 
(Chagnon, 1955) and German 
(Wechsler, 1956). The WB has been 
translated into Danish and tried out 
with institutional cases (Mogensen, 
1958 unpublished). Italian prisoners 
have been tested (Lazzari, Ferrecuti, 
& Rizzo, 1958). Priester (1957), and 
Priester and Kukulka (1958) pre- 
sented a method of comparing 
HAWIE (German WAIS) subtest 
scatter with Wechsler’s diagnostic 
signs. He also compared the HAWIE 
with the HAWIK (German WISC) 


and the Binet-Bobertag, finding them 
sufficiently comparable to be con- 
sidered parallel tests. Cultural as- 
pects of the WAIS in Canadian sub- 
jects (Hopkins, 1957) and in British 
mental patients (Robertson & 
Batcheldor, 1956) have been reported. 
The latter authors concluded the 
British subjects were better on liter- 
ary and poorer on scientific I and V 
items than the American standardiza- 
tion sample; accuracy rather than 
speed characterized the British ap- 
proach. 

More directly to the point was a
series of discerning articles by Levin- 
son (1958, 1959) who expounds the 
thesis that reliable and valid differ- 
ences between VIQs and PIQs are not 
necessarily the result of pathology 
but may reflect the deviant values 
associated with specific subcultures. 
He substantiates his case by citing 
the WAIS scores of 64 Yeshiva Uni- 
versity students who had been in- 
doctrinated with the traditional 
Jewish cultural values that place 
great stress upon verbal accomplish- 
ments and discount manual skills. 
This group obtained a mean VIQ of 
125.6 but a mean PIQ of only 105.3, 
with 97% of the subjects having a 
higher VIQ than PIQ. 

A well-designed investigation com- 
paring the youngest WAIS standard- 
ization group with 100 Navaho 
Indians of comparable age, sex, 
education, occupation, and rural- 
urban residence (Howell, Evans, & 
Downing, 1958) afforded a striking 
contrast with the studies of Jewish 
students. The Navaho group ob- 
tained a VIQ of 84.0 and a PIQ of 
95.4, which were significantly lower 
than those of the standardization 
group. Another group, however, 
which also stresses manipulative skills 
more than verbal accomplishments, 
the Southern Negro, showed a slight 
and nonsignificant tendency for the 
WB II VIQ to be higher than PIQ 
(Davis, 1957). This was true for both
his mental patients with various 
diagnoses and hospital employees, 
but perhaps most significant were 
the absolute levels (mean FSIQ 68 
for the employees and 67 for the pa- 
tients). A question concerning edu- 
cational background of these groups 
arises, and a supplementary investi- 
gation indicated that both groups 
compared favorably on amount of 
education with the 1950 census 
figures for nonwhites in Florida. 
Scarborough (1956) compared 40 
venereal diseased patients with 118 
control subjects in a complex, poorly 
designed study and derived inconclu- 
sive results. His findings suggest that 
Southern Negroes do less well on the 
WB (IQs ~80) than Southern whites 
(IQs ~90) and that the patients of
either race do almost as well as their 
own control group. The Negro sub- 
jects in this and in the Davis study 
did relatively well on OA but poorly 
on D and DS. Just why Scarborough's 
Negro subjects from Georgia should 
average almost 13 IQ points higher 
than Davis's Negro subjects from 
Florida is puzzling. 

Some very interesting information 
about the intellectual distribution of 
3,594 unwed mothers placing their 
children for adoption in Minne- 
sota was provided by Pearson and 
Amacher (1956). The mean IQ was 
100.19 with a standard deviation of 
18.36. Although approaching a nor- 
mal distribution, there were fewer 
cases than expected between IQ 83 
and 91. The authors hypothesized 
that these deviations were due to a 
greater proportion of mothers falling 
at the extremes of the intellectual 
continuum placing their babies for 
adoption because of necessity or 
social pressure, while dull normal 
mothers more commonly keep and 
rear their illegitimate children. It is 
noteworthy that "repeaters" obtained
a mean IQ of 93.3.


Summary. Intellectual growth, as 
defined by improved test performance
on the WAIS, continues in our
culture until 25-30 years of age, but 
wide individual differences exist in 
the age of maturation ranging from 
the early or middle teens to the late 
twenties or older. Shortly after the
intellectual peak, however, aging 
makes its first encroachments upon 
perceptual and psychomotor tasks; 
only considerably later does it ap- 
preciably affect verbal skills. Whether 
Wechsler's (1958, p. 143) conceptual 
distinction between "intelligence" 
and “wisdom” (defined by reference 
to the ability of the old sage to cope 
with life's problems) is useful re- 
mains to be seen, but an obvious im- 
plication is that a test for each con- 
cept is needed at least to evaluate the 
hypothesis that both are worthwhile. 
Although sex differences have been 
demonstrated fairly consistently on 
certain subtests, IQs are usually 
comparable. In addition to age and 
sex, a variety of environmental influ- 
ences, such as subcultural background 
and values, education and vocational 
history, socioeconomic conditions, 
etc., may produce diverse and dra- 
matic effects upon intelligence test 
scores. 

Thus, the conclusion of this section 
remains essentially the same as in 
previous reviews although valuable 
new data has been added, namely, 
that a number of variables besides 
pathology affect Wechsler perform- 
ance and consequently must be 
controlled or accounted for in ade- 
quate analyses. Clearly, no one can 
criticize Dunnette and Kirchner’s 
(1958) plea for validity studies in 
specific vocational situations instead
of reliance upon the assumed intrinsic 
validity of a test. 


Refinements and Critiques 


Administration and scoring. In 
contrast to the last review, only one 
paper is concerned with item order
and difficulty of the WB. Rubin- 
Rabson (1956) points out the time 
boundedness of previously established 
item orders and observes the unde- 
sirable "tendency for items to cluster 
in groups of similar difficulty, [and] 
an abrupt augmentation of difficulty 
from group to group.” 

Two important investigations of 
the effect of administration and test 
taking attitudes were reported. 
Masling (1959) slyly coached his 
accomplices, as he appropriately calls 
them, for “warm” and “cold” roles 
to be played when tested by unsus- 
pecting experimenters. Utilization 
of some memorized answers, taped 
sessions, and a set of judges demon- 
strated that the warm role enhanced 
the score in three ways: experimenters 
used more reinforcing comments, 
gave more opportunity to clarify and 
correct answers, and scoring was more 
lenient toward the warm subjects. 
However, these statistically signifi- 
cant differences were small. 

Nichols (1959) manipulated ego 
involvement and success experience 
for college students taking the WB. 
He concludes: 
differences in test taking attitude on the part 
of the S and minor differences in testing pro- 
cedure on the part of the E do not materially 
affect intelligence test scores. [He adds this 
important caution] However, since the sub- 
jects used in this study were all intelligent 
students who are used to taking tests and 
doing their best, the results may not be di- 
rectly applicable to clinic and hospital 
groups. 


We would add: or to children. 

The effects of a trusting or skepti- 
cal attitude in student nurses upon 
the WAIS S and PC subtests were 
investigated by Wiener (1957) who 
hypothesized that a distrustful attitude
would increase the "no similarity"
or "nothing missing" responses
and thus interfere with performance 
on these subtests. The attitudes were 
measured by a questionnaire, and 


distrustfulness was also presumably 
reinforced or induced by special in- 
structions. The more distrustful 
students on the questionnaire dis- 
played a stronger tendency to make 
the predicted distrustful comments 
and were lower on both S and PC 
subtests. The experimental instruc- 
tions, however, did not depress the 
subtest score but did increase the 
number of comments suggestive of 
distrust. The results are interesting 
and suggestive, but it should be 
noted that the N was small and that 
only difference scores (S-V and
PC-V) were reported.

Guertin (1959) found that various, 
controlled background noises had no 
effect on D performance with a group 
of chronic psychotics. But, again, 
distraction would be more likely for 
subjects who maintain more interest 
in their surroundings, so generaliza- 
tion about the unimportance of noise 
during D administration is most 
hazardous. Blackburn and Benton 
(1957) suggest a more reliable ad- 
ministration and scoring procedure 
for D. They present reliability data 
from several populations and give 
conversion tables. Briggs’ study 
(1960) is reassuring in that only DS 
results were appreciably affected 
when the subject was forced to 
manipulate with his nondominant 
hand. Plumb and Charles (1955) 
studied scoring disagreements to C 
responses and found that experts as 
well as graduate students disagreed 
significantly. Olin (1958) presents 
tables taking into account the sub- 
ject’s age group when prorating IQ. 
Clinicians making prorations of IQ 
in the aged from short forms should 
note that unless Olin’s procedure is 
followed, they are introducing ap- 
preciable error in estimating IQ. 

Factor analyses. Davis (1956) 
derived 10 factors from the WB sub- 
tests, many more than previously 
reported. His use of a narrow range 
of talent emphasizes test or methods
factors as opposed to trait factors 
and increases dimensionality. Saun- 
ders (1959a) observed that Davis 
used a nonuniform procedure in ob- 
taining intercorrelations that also 
could account for the unexpectedly 
large number of factors. Not stopping 
with criticism, Saunders devised a 
crucial test of the dimensionality of 
the WAIS. He divided the subtests 
into odd-even to increase the number 
of variables, thereby avoiding re- 
striction on the number of factors 
forthcoming. From this model study 
he concludes: 

The results are consistent with the efforts of 
some clinical psychologists to interpret the 
Wechsler "psychogram" as a personality
measure provided attention is given to indi- 
vidual items of C and PC. Results are also 
consistent with prior factor studies of the 


Wechsler which have found only three to five 
[group] factors. 


Cohen (1957b) found factors on the 
WAIS similar to those obtained 
earlier on the WB. Besides a strong 
general intellectual factor he found a 
verbal comprehension, a perceptual 
organization, and a memory factor. 
Findings based upon four age groups 
lead him to conclude: 

This evidence is contrary to Garrett's "differentiation
hypothesis," which suggests a


sharp reduction in the importance of the gen- 
eral factor by the late teens. 


He notes the exception that the 
memory factor tends to supplant 
much of the general factor in the old 
age group. He feels that the rather 
low amount of subtest specificity 
encountered helps account for disap- 
pointing outcomes with pattern anal- 
ysis. Zwart and Houwink (1958) 
also found three WAIS subtest fac- 
tors, two of which corresponded 
closely to Cohen’s factors. 

Saunders (1960b) reanalyzed his 
own WAIS data to study the factors 
involved in PC subtest items. Find- 
ings are interesting and important to 


the WAIS user since three distinct, 
clinically meaningful factors emerged. 
In another reanalysis, Saunders 
(1960a) found six factors were neces- 
sary to account for I and A responses. 
The complexity of I and the inap- 
propriateness of an over-all subtest 
score for pattern analysis is illustrated 
by the appearance of five factors in- 
volved in this single subtest. Three 
factors underlie the A subtest. 

Subtest rationale. Saunders (1959b) 
discusses the rationale of the Wechsler 
subtests in terms of clinically derived 
hypotheses that are consistent with 
early statistical findings. Cohen 
(1957a) similarly discusses WAIS 
subtest rationale in the light of his 
factor analytic findings. 

Levine (1958) concentrated on S 
and separated out the "not alike" 
responders. He found they had a 
lower mean IQ and he discusses the 
theoretical implications. In another 
study Levine, Glass, and Meltzoff 
(1957) separated out the “N” re- 
versers on DS and found they too 
were less intelligent and that their "cognitive
inhibition time" (capacity to delay a
response) was poorer than for con- 
trols. 

Matarazzo and Phillips (1955) 
were interested in the relationship 
between manifest anxiety score and 
DS performance. They believed a 
nonmonotonic function best ex- 
plained their data. When Goodstein 
and Farber (1957) examined the 
relationship between manifest anxi- 
ety and DS score, they included a 
very anxious group to extend the 
range of anxiety upward in the hope 
of clarifying the nature of the rela- 
tionship, but no significant relation- 
ship of any kind could be recognized. 

Heilbrun (1960) calculated the 
intercorrelations of four immediate 
memory tests including WB D for 
brain damaged and control patients. 
All intercorrelations for both groups 
were significant (ranging from .26 to 
.62), suggesting a general memory
factor but, nevertheless, of such re- 
stricted magnitudes as to dictate 
"considerable caution" in deriving 
conclusions regarding an individual's 
general memory functioning from 
only one test. 

Summary. This section represents 
interest constructively directed at 
how the Wechsler works and what 
can be done to improve it; thus, it is 
disappointing to see that there are 
somewhat fewer articles covered than 
in the previous review. However, the 
quality of the articles is generally 
good. Cohen (1957b) continued to 
contribute methodologically by using 
age groups in factor analytic design. 
Saunders (1959a) has provided us
with a first look at the specific and
group factor structure of the Wechsler.
His factor analyses of subtest
items have been most productive and
we look forward to further reports of 
these findings and the time when he 
will bring forth an up-to-date ra- 
tionale for all the subtests. Nichols’ 
(1959) manipulation of ego involve- 
ment and success experience provides 
important information and needs to 
be extended to other populations. 


THE WECHSLER AS A DIAGNOSTIC AID
Personality Variables 

Anxiety. In most studies the crite- 
rion measure for anxiety was the 
Taylor scale. Using a wide variety of 
subjects, such as psychiatric aides 
compared with outpatient state hospital
patients, high and low anxiety
groups of college undergraduates, or
medical compared with psychiatric 
VA patients, Dana (1957b); Good- 
stein and Farber (1957); Mayzner, 
Sersen, and Tresselt (1955); and 
Matarazzo (1955) found no consistent 
relationship between the Taylor scale 
and Wechsler scores. Siegman (1956) 
found that Taylor scale anxiety was 
associated with lowered performance 
on timed subtests only. However, 


using a college population Calvin, 
Koons, Bingham, and Fink (1955) 
found a consistent relationship be- 
tween scores on the Taylor scale and 
diminished efficiency on such WB 
items as FSIQ, VIQ, V, I, D, A, BD, 
and OA. Not using the Taylor crite- 
rion, Griffiths (1958) assumed induc- 
tion of anxiety in a group of college 
freshmen exposed to an experience of 
failure in a testing situation. As 
compared to controls, significantly 
lower performance was observed on 
D and I but not on A, OA, or DS. 

Kerrick (1955) found that anxiety 
disrupted over-all performance of 
Air Force trainees on the WB, whereas 
in a similar study, Mayzner, Sersen, 
and Tresselt (1955) failed to observe 
such impairment with college stu- 
dents. Mayzner et al. hypothesized 
that the differences in the findings 
between the two studies might be 
attributable to the appreciable anxi- 
ety of Kerrick's Air Force trainees, 
who realized the greater relevance of 
the test results to their future careers 
in service, as compared with the 
college subjects. 

Miscellaneous. Tallent (1958) was 
unable to support the clinical interpretation
that ninth grade boys saying
"yell fire" to the C "theatre" item
are impulsive behaviorally as judged 
by their teachers. Of course, the 
negative results might equally well 
indicate that teachers have little 
recognition of their students' impul- 
siveness. Of related interest is the 
finding that "ego delay function," as
measured by Barron M-threshold ink- 
blots, time estimation, and Stroop 
Color-Word Test, was correlated 
with WBIQ and D (Spivack, Levine, 
& Sprigle, 1959). 

The WB has also been evaluated 
as a predictor of continuation in 
psychoanalytically oriented therapy 
(Hiler, 1958). Patients remaining in 
treatment for at least 20 sessions 
averaged about 10 points higher in 




IQ (mean IQ 112) and did better on S 
but poorer on D and DS relative to 
the other subtests than the patients 
discontinuing within five sessions. 
McReynolds and Weide (1960) re- 
ported dramatic changes on DS fol- 
lowing prefrontal lobotomies, but 
the subtest given preoperatively was 
not predictive of the degree of psychi- 
atric improvement postoperatively. 


Investigations of Diagnostic Value 


Several studies regarding the gen- 
eral diagnostic usefulness of the WB 
have appeared to reinforce our cau- 
tious, skeptical approach to the 
clinical application of tentative rela- 
tionships between test results and 
psychiatric condition. Frank (1956) 
correlated and factor analyzed the 
subtest scores of 60 subjects from 
nine diagnostic groups which, in a 
previous analysis, appeared homo- 
geneous in subtest scores. Only two 
unrotated factors were isolated: VIQ 
and PIQ. The conclusion was that
"the WB does not yield significant 
data as regards psychiatric diagnosis, 
and continues to sort subjects in 
terms of intellectual factors only." 
Cohen (1955) submitted WB profiles 
of 300 male veteran patients diagnosed
as psychoneurotic, schizophrenic,
or brain damaged to seven
experienced clinical psychologists and 
had them attempt to classify each 
case. Only one of the seven psycholo- 
gists correctly classified a significant 
number (132) of the 300 patients and 
only two others had above-chance 
success in the diagnosis of a single 
diagnostic group which in both cases 
was the brain damaged group. The 
judged classification correlated with
the neuropsychiatric diagnosis between
.13 and .22, which was deemed
far too small to be of use clinically.
It was concluded that there is some 
nonchance relationship between the 
WB pattern and the clinical diagnosis 
but that this relationship is detected 


by only a few clinicians and even then 
to only a degree having little practi- 
cal value. Despite these and earlier 
studies, some clinicians continue to 
use the test diagnostically with little 
hesitation. 

Almost at the other extreme, how- 
ever, are the clinicians who discount 
or disregard the possible influence of 
emotional or environmental factors 
upon IQ scores. For example, Gar- 
field and Affleck (1960) reviewed 24 
cases committed to an institution for 
the retarded but later judged not 
mentally defective and found the IQ 
played an important role in the 
commitment proceedings. In most 
of these cases serious emotional prob- 
lems, deprived environments, or un- 
cooperativeness existed but were 
neglected by the psychometrist who 
proceeded to write with finality a 
report diagnosing mental deficiency 
and indicating a poor prognosis. The 
gross misinterpretations and misuses 
of the IQ described in this article 
should arouse some concern over 
maintaining acceptable standards for 
practicing psychometrists. 

Rabin, King, and Ehrmann (1955) 
found long-term schizophrenics were 
lower than normals and short-term 
schizophrenics on the WB Vocabu- 
lary. Normals and short-term schizo- 
phrenics did not differ significantly. 
Characteristics of the stimulus word 
also affected the level of communica- 
tion; thus, it seemed that the possible 
effects of chronicity, severity of the 
pathology, type of verbal material, 
and scoring system should all be con- 
sidered in investigations involving 
verbal behavior of schizophrenics. A 
similar, detailed analysis of the WB 
Vocabulary performance of brain 
damaged patients by Heilbrun
(1958b) revealed no significant differ- 
ences between such patients and 
physically ill patients either in terms 
of accuracy (standard scoring) or 
mode of response (categorical, descriptive,
equivalent, or functional).
Thus, the concept of "latent aphasia"
was not confirmed. Heilbrun (1958a) 
also assessed the discriminative effec- 
tiveness of D between brain damaged 
patients, psychotics, neurotics, physi- 
cally ill, ward attendants, and college 
students. He concluded that: 


despite the established sensitivity of the D 
test to cerebral pathology, the test still falls 
short of being a useful method of discrimi- 
nating between brain damaged and non-brain 
damaged. 


Measurement of Scatter 


Difference scores. Shortly after 
publication of the WAIS, Jones 
(1956) and McNemar (1957) cautioned
that differences between subtests
may not have diagnostic significance
since the distribution of difference
scores for "normals" extends
considerably beyond the point of 
statistical significance determined by 
the standard error of measurement, 
e.g., 30% of even the standardization 
population received a statistically 
reliable difference score between cer- 
tain subtests. The median reliability 
of these difference scores was reported 
by McNemar as being .60; hence, 
much of the difference score variance 
is attributable to errors of measure- 
ment. Fisher (1960), correcting 
Wolfensberger’s calculations (1958), 
presented a table for determining the 
significance of a difference between 
VIQ and PIQ on the WAIS and WB. 
Field (1960b), like Jones and Mc- 
Nemar, emphasized the distinction 
between the “abnormality” and the 
"reliability" of a WAIS difference 
score and presented useful tables 
indicating abnormality and statistical 
reliability of VIQ-PIQ differences 
and the reliability of subtest dis- 
crepancies singly or in combinations. 
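
The dependence of difference-score reliability on the reliabilities of
the two scores and on their intercorrelation can be illustrated with the
standard classical test theory formula for two equally variable scores.
The following is a minimal sketch only: the function name and the input
values (.85, .80, and .55) are hypothetical and were chosen for
illustration; they are not figures reported by McNemar or Jones.

```python
def difference_score_reliability(rel_x, rel_y, r_xy):
    """Reliability of the difference X - Y under classical test theory.

    Assumes X and Y are expressed on scales with equal variance.
    rel_x, rel_y: reliabilities of the two scores (hypothetical inputs).
    r_xy: correlation between the two scores (hypothetical input).
    """
    if r_xy >= 1.0:
        raise ValueError("r_xy must be below 1.0")
    mean_reliability = (rel_x + rel_y) / 2.0
    return (mean_reliability - r_xy) / (1.0 - r_xy)


# Illustrative values only, not taken from the studies reviewed.
r_dd = difference_score_reliability(0.85, 0.80, 0.55)
error_share = 1.0 - r_dd  # proportion of difference variance due to error
print(round(r_dd, 2), round(error_share, 2))
```

With these assumed values the difference score has a reliability near
.61, so that roughly 39% of its variance would be error of measurement,
even though each score taken singly is considerably more reliable.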

The abnormality-reliability distinction
is easily seen by noting that
a VIQ-PIQ discrepancy of approximately
25 points occurred once in
every 100 subjects in the standardization
population; thus, a greater discrepancy
might be considered significant
or "abnormal" in a statistical
sense (see tables by Fisher & Field).
On the other hand, a VIQ-PIQ 
discrepancy of approximately 13 
points would occur only once in 100 
times by chance, i.e., because of errors 
of measurement associated with the 
IQ scores involved in the comparison. 
Consequently, a VIQ-PIQ discrepancy
of 13 or greater is not likely to
be spurious in the sense of a measure- 
ment error, but such “real” differ- 
ences are not unusual in the general 
population until they reach the mag- 
nitude of 25 IQ points or more. Ap- 
parently this distinction has not been 
thoroughly understood or has been 
disregarded. Even Wechsler (1958, 
p. 160) said that "in most instances a
difference of 15 or more (IQ) points 
may be interpreted as diagnostically 
significant" and at a later point that 
"a deviation of two or more scaled 
score units on any subtest from the 
[subtest] mean is a convenient cut-off 
point" in defining what constitutes an
"abnormal deviation." However,
according to Field's table involving 
the reliability of differences, a subtest 
must deviate by at least 5.75 weighted 
score points from the mean of the remaining
subtests in order to be significant
at the .05 level.
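
The two criteria can be contrasted numerically. The sketch below assumes
normally distributed IQs with a standard deviation of 15; the
reliabilities (.96 and .93) and the VIQ-PIQ correlation (.79) are
illustrative assumptions rather than values quoted in this review, and
the function names are hypothetical.

```python
import math

Z_01 = 2.576  # two-tailed critical z for p = .01


def sem(sd, reliability):
    """Standard error of measurement of a single score."""
    return sd * math.sqrt(1.0 - reliability)


def reliable_difference(sd, rel_a, rel_b, z=Z_01):
    """Smallest difference unlikely to arise from measurement error alone."""
    se_diff = math.sqrt(sem(sd, rel_a) ** 2 + sem(sd, rel_b) ** 2)
    return z * se_diff


def abnormal_difference(sd, r_ab, z=Z_01):
    """Smallest difference that is rare in the population itself."""
    sd_diff = sd * math.sqrt(2.0 - 2.0 * r_ab)
    return z * sd_diff


# Assumed values: SD = 15, reliabilities .96 and .93, correlation .79.
reliable = reliable_difference(15, 0.96, 0.93)
abnormal = abnormal_difference(15, 0.79)
print(round(reliable, 1), round(abnormal, 1))
```

Under these assumptions the .01 cutoff for a reliable difference falls
close to 13 points, whereas the difference exceeded by only 1% of such a
population falls close to 25 points. The first depends on the
reliabilities of the two scores, the second on their correlation.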

Griffith and Yamahiro (1958) re- 
ported the reliability or stability of 
subtest scatter in a heterogeneous 
group of 55 neuropsychiatric patients 
over an interval of 1-10 years (mean 
duration 42 months). The rank-order 
correlation between subtest scores 
averaged .51 with the higher rho’s 
being associated with test-retest com- 
parisons involving the same form and 
shorter intervals. They cautiously 
conclude that: 
whether the patterns of deviation do or do not 
have personality or psychodiagnostic validity,
the reliability is such that they might have. 
Subtest deviation scores from Vo- 
cabulary would seem to be a depend- 
able procedure for psychiatric pa- 
tients since Kasper (1958) found no 
significant relationship between rat- 
ings of “morbidity” (Lorr's Multi- 
dimensional scale) and Vocabulary 
for psychiatric patients. 

Intellectual efficiency and potential.
Since the inference of intellectual 
efficiency is sometimes made from a 
minimum of intratest scatter on the 
WAIS Vocabulary, Fink and Shontz 
(1958) analyzed 100 random protocols 
from Wechsler's standardization and 
100 from physically ill patients in 
order to determine the frequency of 
0-, 1-, and 2-point responses for each 
Vocabulary item. Several deviations 
from the expected frequency for 
stimulus words were noted: e.g., 
WINTER, BREAKFAST, FABRIC, SLICE, 
ENORMOUS, SENTENCE, REGULATE, and 
REMORSE all yielded more one-point 
responses than expected for both 
groups. Brown and Bryan (1957) 
concerned themselves with an "altitude
quotient" (IQ based upon the
two highest subtest scores) as an esti- 
mate of intellectual potential in 270 
young, 'nonclinic' WB subjects. 
The mean difference between FSIQ 
and the altitude quotient was 24.6, 
with a standard deviation of 8.1; this 
difference tended to diminish with 
increased intellectual maturity (CA) 
and higher IQs. A correlation of .87 
was found between the IQ and the 
altitude quotient in this group. 

Mahrer and Bernstein (1958) ex- 
plored performance on repeated 
Wechsler Verbal subtest administra- 
tions. They urged subjects to give as 
many answers as possible and scored 
only the best. IQs continued to as- 
cend upon successive administration 
and they feel that this novel approach 
gives a good indication of intellectual 


potential. This method was compared 
by Thorp and Mahrer (1959) with 
four other more easily calculated 
estimates of potential intelligence: 
(a) intersubtest variability; (b) pro- 
rating the IQ from the highest subtest 
score; (c) prorating the IQ from Vo- 
cabulary; and (d) prorating the IQ 
from the three highest subtests 
weighted by 2.5, 1.5, and 1.0, respec- 
tively, from highest to lowest. For 60 
neuropsychiatric military patients, 
only Methods b and d involving the
higher subtests yielded high correla- 
tions (.80 to .90) with the potential IQ 
estimated by the more laborious 
method. Yet, Mahrer and Bern- 
stein's method yielded a higher esti- 
mate of potential intelligence “in 
almost every case" than the corre- 
sponding estimate by the other meth- 
ods. These investigators also found a 
negative correlation (—.41) between 
the FSIQ and the increase in IQ when 
potential was estimated which seemed 
largely attributable to IQs over 105, 
suggesting a ceiling effect. 

Scatter and diagnosis. By tallying 
the incorrect WAIS PC responses of 
110 normal females and 110 female 
psychiatric patients, Wolfson and 
Weltman (1960) determined the 
errors characteristic of female psychi- 
atric patients. As one might expect, 
psychotics were more likely to give a 
unique response than were neurotics 
or personality disorders, and 81% of 
the patients gave at least one unique 
response. Trehub and Scherer (1958) 
investigated the individual intersub- 
test variability within a sample of 
psychiatric patients composed of 166 
(61.7%) schizophrenics and 103 neu- 
rotics or character disorders. Their 
cutting score indicative of schizo- 
phrenia yielded 72.1% correct identi- 
fication, an improvement of 10.4% 
over the schizophrenic base rate. The 
proportion of misclassifications could 
have been further reduced by using 
only the extremes of the distribution;
however, this necessitates a corre- 
sponding reduction in the number of 
patients about whom diagnostic 
statements are made. 
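
The arithmetic behind this comparison is sketched below, using the
sample sizes and hit rate reported above. The function name is
hypothetical, and the comparison assumes that the base-rate strategy
consists of calling every patient schizophrenic.

```python
def base_rate_gain(n_target, n_other, hit_rate):
    """Return the base rate of the target group and the gain over it.

    n_target: number of patients in the target diagnostic group.
    n_other: number of patients in all other groups.
    hit_rate: proportion correctly identified by the cutting score.
    """
    base_rate = n_target / (n_target + n_other)
    return base_rate, hit_rate - base_rate


# 166 schizophrenics, 103 neurotics or character disorders, 72.1% correct.
base, gain = base_rate_gain(166, 103, 0.721)
print(round(base * 100, 1), round(gain * 100, 1))
```

A cutting score is of practical use only to the extent that this gain is
positive and large enough to justify the testing effort.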

An obvious limiting factor in the 
usefulness of any diagnostic sign is 
that it may differentiate selected 
diagnostic groups but not be uniquely 
associated with a single nosological 
category. For example, Ladd (1959) 
found that intersubtest variability 
was also greater in a brain damaged 
group than in a comparable neurotic 
group; Diller (1955) reported an in- 
flated "mean range ratio" in de- 
linquents; and Plumeau, Machover, 
and Puzzo (1960) found a higher total 
scatter index for alcoholics. Conse- 
quently, other indices are needed to 
distinguish one pathological group 
from the other and such are the goals 
of pattern analysis, to be discussed in 
a subsequent section.

Summary. A necessary distinction 
has been drawn between the reliability 
of a difference in IQ points or subtest 
weighted scores and the frequency of 
occurrence of such differences in 
specified populations. The cautions 
against confusing the two concepts 
should be heeded. Measures of inter- 
subtest scatter frequently distinguish 
groups of delinquents, schizophrenics, 
and organics from normals; however, 
the diagnostic value of this "sign" 
alone is negligible since it is clearly 
not unique to any one diagnostic 
group or sufficiently discriminative to 
be reliable in the individual case. 
There are several fairly reliable but 
not necessarily highly correlated 
methods of estimating intellectual 
efficiency or potential, but we must 
wait hopefully for evidence regarding 
the usefulness of such measures. 


Pattern Analysis 


The performance of Wechsler’s 
group of adolescent psychopaths 


(1944) was characterized by PIQ>VIQ,
OA+PA>BD+PC, and PA>all
other subtests. Using a sample
of sex offenders ranging from 14 to 64 
years old, Wiens, Matarazzo, and 
Gavor (1959) found the PIQ-VIQ 
relationship to be supported, while 
neither Foster's (1959) adolescent 
recidivists, Field's (1960a) English 
recidivists, or Panton's (1960) pris- 
oners support it. Foster did find that 
OA+PA>BD+PC but that PA ex- 
ceeded only BD and D. Graham and 
Kamano (1958) found a pattern 
similar to Wechsler's psychopathic 
group in a sample of inmates of a 
federal institution only when they 
were also classified as unsuccessful 
readers; the "successful readers" did
not yield the predicted pattern. 
Purcell (1956) found that in his sam- 
ple of Army trainee delinquents BD 
was least impaired, and that the most 
frequent offenders did poorest on C,
V, and A. 

A thorough analysis of the WB 
performance of 87 male and 80 female 
juvenile delinquents matched for age, 
grade placement, and global IQ was 
made by Diller (1955). The sexes 
were judged equally endowed with 
potential intelligence as indicated by 
prorating the three highest subtests, 
and both obtained a higher PIQ than 
VIQ. In terms of factors previously 
identified by Jastak, the delinquents 
were impaired in “verbal develop- 
ment" (V, I, C, S), “motivation” 
(A, D, DS), and mildly so in the 
“psychomotor area" (BD, DS, I, PA). 
The sexes differed in that the males 
were superior in "reality contact"
(C, PA, PC, OA), while the females 
had more "self control." Two indi- 
vidual subtests showed sex differences 
—PC and DS—with males doing 
better on the former and poorer on 
the latter. 


With regard to subjects addicted to 


alcohol: some chronic alcoholics 
showed evidence of pathology (clinically
as well as test-wise) typical of
the organic as in studies by Kaldegg 
(1956) and Tumarkin, Wilson, and 
Snyder (1955); while other alco- 
holics, even after 10-30 years of 
intense indulgence, were reported to 
show no apparent gross intellectual 
deterioration (Peters, 1956). Bauer 
and Johnson (1957) found no signifi- 
cant difference on subtest perform- 
ance between chronic alcoholics as 
compared with the general run of 
neurotics or "functional" psychotics. 
Plumeau et al. (1960) found that A 
was lower for "unremitted" alcoholics
than for either "remitted"
alcoholics or controls. 


Effects of Organic Brain Damage 


Wechsler's patterns. Wechsler's
subtest patterning for organicity was 
not cross-validated by Everett (1956),
Fisher (1958), Ladd (1959), Love
(1955), or Reitan (1959). Wechsler's
observation that PIQ<VIQ was
found by both Ladd and Love in their 
heterogeneous organic samples, in a 
group of organics with nonfrontal 
lobe lesions by Morrow and Mark 
(1955), in a group with right hemisphere
damage by Klove (1959), in a
group demonstrating poor “spatial 
integration" by Klove and Reitan 
(1958), and in a group of normal 
senescents of superior intellectual 
ability by Norman and Daley (1959). 
Eisdorfer, Busse, and Cohen (1959) 
found PIQ<VIQ for an aged group 
and Morrow and Mark observed this 
relationship in their organics grouped 
by foci. 

With regard to Wechsler's "Hold-Don't
Hold" ratio: Reitan (1959)
found some support for this pattern 
when using a pathological group as 
compared to Norman and Daley 
(1959) who did not when using normal
senescents. In this ratio it is assumed
that C, I, PC, and OA will be


resistive to the effect of factors con- 
tributing to intellectual deterioration. 
Reitan's (1956) organics did not do 
well on C and I, as compared to the 
organics seen by Howell (1955); 
Inglis, Shapiro, and Post (1956); 
Klove and Reitan (1958); and Mor- 
row and Mark (1955). The organic 
samples of Klove, Klove and Reitan, 
and of Morrow and Mark, and Nor- 
man and Daley's senescents did not 
do well on PC. None of the organics 
assessed by Ladd (1959), Morrow 
and Mark, or Norman and Daley's 
senescents did well on OA, although 
Klove's organics did. 

In Wechsler's ratio it is also as- 
sumed that D, A, BD, and DS will be 
most affected by factors contributing 
to intellectual deterioration. In gen- 
eral, this was supported by the find- 
ings of Klove and Reitan (1958), 
Ladd (1959), Love (1955), Norman 
and Daley (1959), and Reitan (1956). 
However, neither Heilbrun (1958a),
Reitan (1959), nor Ladd found that D
was significantly lower for their or- 
ganics, whereas Klove and Reitan, 
Morrow and Mark (1955), and Tolor 
(1956, 1958) did. Klove (1959) found 
that low D and A were characteristic 
of his sample of patients with left 
hemisphere damage only. The find- 
ings of Heilbrun (1959), Howell 
(1955), Klove, and Parker (1957) all 
attest to the significantly poor per- 
formance of organics on BD, and 
Thaler (1956) found that decrements 
in BD were directly related to aging. 
This is contrary to the performance 
of the organics seen by Fisher (1958), 
and Inglis et al. (1956), or the senes- 
cents seen by Norman and Daley. 
Neither Fisher nor Howell found that 
their samples of organics demon- 
strated any unique difficulty on DS; 
however, the groups seen by Klove, 
Klove and Reitan, and Morrow and 
Mark did. Moreover, the data of 
Loranger and Misiak (1959), Norman
and Daley, and Thaler demon-
strate that DS performance, as with 
BD, declined with age. Yet Hall 
(1956) observed that the organic 
pattern, DS+BD<I+V, frequently
occurred in nonorganic patients. 

Hewson ratio. Everett (1956) 
found no significant relationship be- 
tween the presence of organicity and 
the Hewson ratio while McKeever 
and Gerstein (1958) found that the 
Hewson ratio classified 75% of a 
group of schizophrenics as organics. 
Bryan and Brown (1957) found that 
the Hewson ratio identified 27% of a 
nonorganic group as organic, that 
38% of a group of adolescents sus- 
pected of having CNS involvement 
on the basis of clinical data were 
identified as organic, but that 67% 
of patients with known organic in- 
volvement of a “mild” degree, and 
96% of patients with a "moderate"
to "marked" degree of organic im- 
pairment were correctly identified as 
organic. 

Effects of specific organic involve- 
ments. Bressler (1956) found that 
PIQ significantly differentiated 
aphasics from normals, but not or- 
ganics with aphasia from those with- 
out. Fisher (1958) found that 
paretics demonstrated selective im- 
pairment on subtests and that paresis 
affects verbal abilities to as great an 
extent as performance. Klove and 
Reitan (1958), Milner (1958), and 
Reitan (1955) found that patients 
with left hemisphere lesions do poorer 
on verbal tests as compared to those 
with right hemisphere lesions, the 
latter doing poorer on performance 
tests. Heilbrun (1956) also found
lower verbal scores for left hemi- 
sphere lesions but failed to find that 
their performance scores were better 
than for the right hemisphere group. 
Bortner and Birch (1960) found left 
hemiplegics had more difficulty with 
BD than right hemiplegics but the N


was small and the task involved only 
recognition. 

Thaler (1956) found that patients 
with normal and focal EEG tracings 
perform better on such tests as V, I, 
BD, and DS as compared to those 
with mixed or diffuse tracings. How- 
ever, Morrow and Mark's data (1955) 
suggest (a) no significant difference 
in the performance of patients with 
either focal or diffuse cortical lesions; 
(b) patients with frontal lobe lesions 
showed only slight intellectual im- 
pairment, save on DS, while patients 
with lesions dorsal to the Rolandic 
fissure demonstrated a tendency to- 
ward greater intellectual impairment; 
and (c) patients with left hemisphere 
damage demonstrated a tendency to 
loss in VIQ and PIQ, whereas pa- 
tients with bilateral lesions showed 
loss in PIQ only. 

Summary. Research findings in 
this section are at best inconsistent 
and, hence, inconclusive. One study 
demonstrated a superiority of predic- 
tions based on behavioral data as 
compared to a few a priori test pat- 
terns (Gaston, 1959) and another, 
the difficulty of even some seasoned 
clinicians to sort test profiles into 
gross categories of neurosis, schizo- 
phrenia, and organicity (Cohen, 
1955); and Frank (1956) found the 
same inability to sort patterns even 
when the "sorter" is factor analysis.
Yet in spite of the continued equivo- 
cality of the findings, faith persists in 
the assumption that a test of cogni- 
tive functions should be able to reveal 
more about a person than just his IQ. 
This faith may not be completely 
unjustified. 

One might ask whether the sup- 
portive evidence might not be chance 
phenomena, whether the persistent 
inconsistency of the findings from 
review to review does not strongly 
suggest the fruitlessness of attempt- 
ing to make assessment of Wechsler 
patterns. Yet the frequent occurrence
of positive studies may be regarded
as evidence that analysis of
patterns can be meaningful and that 
something other than the tool itself 
might account for the failure of the 
research to provide consistent and 
definitive answers. 

One of the methodological short- 
comings is the failure to distinguish 
between a mean diagnostic group 
profile and modal patterns of homo- 
geneous subjects in a diagnostic 
group. While there is only one mean 
group profile for a sample, several 
groups of the subjects may form 
clusters of homogeneous symptoms 
with rather dissimilar modal patterns. 
Furthermore, the group profile can- 
not be expected to conform to any of 
the modal patterns since it is a 
statistic and no single subject should 
be expected to correspond to the 
mean group profile. Only modal pat- 
terns are appropriate for diagnostic 
purposes. Wechsler (1944) fails to 
identify the nature of his proposed 
diagnostic patterns. Since only one is 
given for each diagnostic group it 
seems likely that he has proposed the 
relatively useless group profile; at 
least, this is presumed by most in- 
vestigators in checking the validity 
of his proposals. Only a clear under- 
standing of these simple principles 
can lead to a respectable research ap- 
proach to diagnostic pattern analysis. 
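
The distinction between a mean group profile and modal patterns can be illustrated with a small numerical sketch. The scaled scores below are invented, the two subgroups are assumed to be known in advance rather than discovered by any clustering procedure, and the helper names (mean_profile, distance) are hypothetical.

```python
from statistics import mean

# Invented scaled scores on three subtests (I, BD, DS) for six subjects
# who share one diagnostic label. These are not data from any study.
subgroup_a = [(12, 6, 7), (13, 5, 6), (12, 6, 6)]     # verbal high, performance low
subgroup_b = [(6, 12, 12), (7, 13, 11), (6, 12, 13)]  # the reverse pattern


def mean_profile(profiles):
    """Average each subtest across subjects (hypothetical helper)."""
    return tuple(round(mean(column), 2) for column in zip(*profiles))


def distance(p, q):
    """City-block distance between two subtest profiles."""
    return sum(abs(a - b) for a, b in zip(p, q))


everyone = subgroup_a + subgroup_b
group_profile = mean_profile(everyone)   # the single mean group profile
modal_a = mean_profile(subgroup_a)       # centre of the first homogeneous cluster
modal_b = mean_profile(subgroup_b)       # centre of the second homogeneous cluster

# Distance from the best-fitting single subject to the mean group profile.
closest_to_group = min(distance(s, group_profile) for s in everyone)

# Distance from the worst-fitting subject to the centre of its own subgroup.
farthest_from_own = max(
    max(distance(s, modal_a) for s in subgroup_a),
    max(distance(s, modal_b) for s in subgroup_b),
)

print("mean group profile:", group_profile)
print("modal patterns:", modal_a, modal_b)
print("closest subject to group profile:", round(closest_to_group, 2))
print("farthest subject from own modal pattern:", round(farthest_from_own, 2))
```

In this invented sample the mean group profile is nearly flat, and even the subject nearest to it lies farther away than any subject lies from the centre of his own subgroup. That is the sense in which the group profile resembles no one.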

An analysis of the investigations 
beyond the results per se suggests 
that much is still to be desired with 
regard to the designs of the research. 
For instance, from a purely methodo- 
logical point of view, one might 
wonder whether or not clinical facts 
are being sacrificed for statistical 
significance. In light of the many 
variables other than intelligence and 
psychopathology that tend to affect 
subtest performance and greatly
expand error variance, the arbitrary
limits of the .01 or .05 level of
confidence might be too high. Yet a
pattern that fails to discriminate
between groups at these levels of
confidence would seem too weak to 
use clinically with individuals. 

One might also be disappointed 
at the seeming lack of flexibility 
and/or creativity regarding the form 
of these experiments. The majority 
of the studies employed the matched 
group design using a statistically 
simple test of an inference (chi square 
or t). Zero-order statistics are seldom
suited to the complex analysis or 
identification of multidimensional 
patterns. Of the many studies sur- 
veyed in this section only two went 
beyond the single or simple multiple 
correlational techniques into factor 
analysis, only six went beyond a t or
the utilization of F as a multiple t, 
and only three studies made use of an 
analysis of variance design to test 
interaction effects. 

One might also show concern re- 
garding the samples of subjects upon 
which the conclusions are based. 
Samples of organics employed have 
been observed to contain such dis- 
parate kinds of pathology as tumors, 
vascular pathologies, infectious dis- 
eases, various kinds of head trauma, 
epilepsy, and developmental
anomalies. Included in a single sampling
have been patients with lesions which 
have been both focal and diffuse, have 
involved different lobes, have been 
uni- and bilateral, or have been both 
cortical and subcortical in nature.
Similarly, in the research on
the "character disorder," the sorts of
behavior included in such a grouping 
might vary from such offenses as 
delinquency, to burglary and dope 
peddling, to assault, rape, and arson. 

One might note that McKeever 
and Gerstein (1958) found that meas- 
ures of organic deterioration varied 
systematically with age, and Fry 
(1956) found that the process of
deterioration was not the same in 
people with limited intellectual ca- 
pacity as compared to others. Group- 
ing subjects by the criteria involved 
in the classification of "character
disorder," or some synonymous
phrase, proves to be no more valid. 
Lazzari, Ferracuti, and Rizzo (1958) 
found significant differences in the 
mean IQ of samples of delinquents 
just on the basis of crime committed, 
i.e., fraud vs. rape. Wiens,
Matarazzo, and Gavor (1959) found that
upon more extensive and intensive 
study, patients initially diagnosed as 
"character disorder" turned out to be
sociopathic personalities, inadequate 
personality types, mental defectives, 
adjustment reactions in adolescence, 
adult situational reactions, depressive 
reactions, neurotics, schizoid per- 
sonalities, and even schizophrenics 
and organics. Therefore, there is 
reason to assume that such heteroge- 
neous groupings introduce a variety 
of systematic effects which would 
detract from the identification of 
consistent and meaningful patterns 
associated with the disorder. 

In the experiments reviewed herein, 
investigators have attempted to off- 
set the effect of certain variables by 
the method of randomization. Yet 
there is some doubt (Cohen, 1955) 
that this is an entirely effective 
procedure in equalizing the influence 
of such factors as age, IQ, education, 
etc. It is still not certain whether 
some of the confusion in the findings 
might not be attributed to the inade- 
quacy of such a procedure. No in- 
vestigators actually sought to deter- 
mine whether the range of age, edu- 
cation, and IQ within the samples 
made for a significant lack of homo- 
geneity of the groups. 
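
A check of the kind that was not reported might look like the following minimal sketch. It assumes two samples whose ages were left to chance rather than matched. The ages are invented, and the function names (standardized_difference, score_range) are hypothetical.

```python
from statistics import mean, stdev

# Invented ages for two samples; not data from any study reviewed here.
organic_ages = [62, 58, 66, 71, 60, 64]
control_ages = [35, 41, 38, 44, 36, 40]


def standardized_difference(group_1, group_2):
    """Difference in means divided by the pooled standard deviation."""
    n1, n2 = len(group_1), len(group_2)
    pooled_variance = (
        (n1 - 1) * stdev(group_1) ** 2 + (n2 - 1) * stdev(group_2) ** 2
    ) / (n1 + n2 - 2)
    return (mean(group_1) - mean(group_2)) / pooled_variance ** 0.5


def score_range(group):
    """Spread of the background variable within one sample."""
    return max(group) - min(group)


age_gap = standardized_difference(organic_ages, control_ages)

print("standardized age difference:", round(age_gap, 2))
print("age range, organics:", score_range(organic_ages))
print("age range, controls:", score_range(control_ages))
```

A large standardized difference between the groups, or a wide range within either one, would suggest that chance assignment has not equalized the background variable. The same check could be repeated for education and IQ.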

It would appear that systematic 
research is still necessary to satisfac- 
torily establish diagnostic patterns. 


We wonder whether the present
interest in pattern analysis of organic
brain diseased patients will persist or
whether, like the former search for
schizophrenic signs, it will go
unrewarded and evaporate. We hope that Reitan’s
current use of carefully specified types 
of organic patients for investigation 
will yield significant patterns and 
point the way for similar investiga- 
tion of homogeneous groups of schizo- 
phrenics. 


GENERAL SUMMARY 


The WAIS is a much improved 
instrument when compared with its 
predecessors. It measures much
the same thing that a number
of other standardized methods
attempt to measure. However, comparative
studies of the instrument suffer from 
methodological shortcomings and rely 
excessively on correlational tech- 
niques and insufficiently on compari- 
sons of mean scores. 

The test has quickly become do- 
mesticated in the various research 
and clinical settings and has produced 
some interesting findings reflecting 
age differences, sex differences, and 
relationships with an array of differ- 
ent educational, vocational, socio- 
economic, and environmental factors. 
There is, perhaps, a need to attempt 
to set up such studies in a broader 
and deeper theoretical framework 
rather than to continue isolated 
forays in the flatlands of pure
empiricism. Wechsler (1958) has
"become increasingly convinced that
intelligence is most usefully
interpreted as an aspect of the total
personality ... an effect rather than a
cause."

Actually the studies on anxiety, 
impulsiveness, distrust, etc. included 
in this review are beginnings in the 
right direction. Inferring other per- 
sonality variables from intellectual 
functioning is really an important 
avenue to diagnosis. When the
concept of diagnosis is thus more broadly
conceived, as personality assessment, 
we need not concur with Meehl 
(1960) in his pessimistic prognostica- 
tion. 

The additional work on "scatter," 
profiles, and patterns has not led us 
on more solid diagnostic ground. The 
results with the several nosological 
categories are inconclusive. Severe 
methodological shortcomings of the 
investigations prevent the isolation 
of modal profiles useful for diagnosis. 
It is perhaps time to face the chal- 
lenge embodied in Binder’s (1956) 
study of schizophrenia. Is there a 
differential intellectual impairment? 
Binder answered the question in the 
negative by demonstrating an over-all
reduction in schizophrenic
functioning when assessed with an instrument
(SRA tests) which measures rela- 
tively independent abilities. Rela- 
tively independent factors of mental 
ability, isolated from the WAIS, 
might serve as a sounder basis for 
future diagnostic studies of noso- 
logical groupings. 

Finally, we again must mention 
the inadequacy (heterogeneity) of the 
criterion—schizophrenia, character 
disorders, etc. We discussed the issue 
in detail elsewhere (Rabin & King, 
1958) and have recommended "The
selection of a specific frame of
reference in the determination of
samples... chronicity, or reactive vs.
process" as an avenue and approach
to more fruitful research. 

The history of science offers many
examples of potentially useful theo- 
ries that did not realize their promise 
until appropriate methods had been 
devised. Once methodology becomes 
available, a flood of research often 
follows, which in turn tries the theory 
and results in new formulations that 
in their turn wait on experimental de- 
vices. Such progress may falter either 
because of an absence of theoretical 
speculation or the lack of methods, 
and it is futile to assign prior im- 
portance to one or the other. 

Such interdependence becomes 
clear when we compare the influence 
that Binet’s scales had on the theories 
of intellective behavior in children to 
the relative dearth of systematic re- 
search on early personality develop- 
ment, although there is certainly no 
lack of theories concerning the latter 
problem. However, standardized 
methods for appraising personality 
variables in preliterate children are in 
short supply. 

Reasons for the dearth of methods 
are not hard to find. Research with 
preschool children presents certain 
special problems. The instructions 
and operations must be simple enough 
for young children to understand. 
The subjects must have the physical 
abilities to perform whatever acts 
are demanded by the method. Per- 
haps most important, the tasks must 
entice and maintain interest against 
a brief span of attention. Further,
for personality research minimal
determination of the child's behavior
as artifacts of measuring devices is
desirable.

1 We are indebted to the participants in the
workshop on doll play methods at the 1957
American Psychological Association meetings
for many suggestions which appear in this
paper. We especially thank Mary Ford, Clara
Melville, Judy Rosenblith, and Richard
Walters for their comments on the manuscript.

Few extant methods meet these 
criteria. Bellaks' CAT (1950) and the 
Rorschach (e.g., Ames, Learned,
Metraux, & Walker, 1952) are widely 
and profitably used but present prob- 
lems. They rely solely on children's 
language responses and are, in the 
writers' opinions, too dependent on 
passive, nonacting out, behavior. 
Likewise, interviews with children, 
because they depend on the child's
understanding of language, may
introduce many idiosyncratic factors.

Doll play has offered the promise of 
range and flexibility in personality 
research. It is the purpose of this 
paper to summarize the research uses 
of doll play and to assay the results of 
the promise which this method offers. 

Doll play started as a clinical de- 
vice. Anna Freud (1928) attributes 
its first use to Melanie Klein who 
employed it as a procedure both for 
the diagnosis and treatment of dis- 
turbed children. However, the con- 
cern of this survey is with the research 
rather than the clinical uses of doll 
play. By "research uses" we shall 
mean that variables have been meas- 
ured by this method and related to 
other variables. The studies reviewed 
cover the period 1933 to 1960. 

What is doll play? There are nu- 
merous variations, but essentially the 
young child is presented with a set of 
dolls (such as a family) and a setting
in which dolls are to operate (such as
a home) and told to manipulate the
dolls while he tells a story about them. 
The child has an opportunity here to 
talk as well as to act. Endless changes 
can and have been rung on this basic 
theme: the composition of the dolls,
the nature of the setting, the amount 
and kinds of interaction with the re- 
searcher, the directions and structure 
presented to the child, etc. A host of 
variables has been scored from the 
children's protocols. Chief among 
them are aggression, stereotypy, and 
prejudice. The method appears to be 
more useful for some of these vari- 
ables than for others. 

Although doll play was used in re- 
search before their work, a strong 
impetus to doll play as a method in 
the study of personality development 
was the work of R. R. and Pauline S. 
Sears at the Iowa Child Welfare Re- 
search Station in the mid-1940's. 
The original studies, after Bach 
(1945) indicated the potential value 
of the method, were methodological 
and will be discussed below. The 
serendipitous occurrence of marked 
sex differences in doll play led to more 
recent work by these investigators 
and their co-workers on identification 
in children, providing one case where 
a characteristic of a method led to 
new theory, rather than the converse 
textbook ideal. 

Because of the theoretical disposi- 
tions of the early investigators, the 
most frequent variables measured 
were derived from behavior theory 
and were indices of acquired drives in 
children. Hence, more than any 
other behavior, fantasy aggression 
has been measured by this technique, 
and it was the happy confluence of 
theory and method that this particu- 
lar behavior is frequently elicited in 
doll play. 

A host of differences exist among 
the subjects, equipment, and proce- 
dures in the studies which will be re- 
viewed. The following sections are 
organized to indicate the modal
findings or procedure and to sketch the
range of variations from the typical 
occurrence. 


METHOD 
Subjects 

The usual subjects in doll play re- 
search have been preschool children. 
However, children between the ages 
of 5 and 10 have been used in many 
investigations, and in three studies 
the subjects were up to 13 years old 
(Honzik, 1951; Levy, 1933; Witkin, 
Lewis, Hertzman, Machover, Meiss- 
ner, & Wapner, 1954). Subjects at 
the extremes of the age range have 
usually required procedural adapta- 
tions. Heinicke (1956), in his study 
of 2-year-olds, found that children of 
this age did not use dolls as agents of 
actions. He felt that he gathered 
meaningful data about his subjects 
by putting them in a doll play situa- 
tion, but in view of the types of 
variables which yielded results—rate 
of play, calling for parents, seeking 
the observer’s affection, hostility to 
dolls and other play objects—it would 
seem that the findings were incidental 
to the doll play method. At the upper 
age levels subjects usually have been 
instructed to regard the dolls as char- 
acters in a play or movie. Only one 
study has used adult subjects— 
Rosenzweig and Shakow (1937) com- 
pared the constructions of play ma- 
terials by adult psychotics to those of 
normal adults, and concluded that 
their subjects responded favorably to 
the technique. 

Most research with doll play has 
employed white subjects. Occasion- 
ally, in studies of racial identification 
and prejudice Negroes have been 
used (Goodman, 1952; Graham, 1955; 
Radke & Trager, 1950; Stevenson & 
Stewart, 1958). The method has also 
been used successfully with American 
Indian groups (Gewirtz, 1950) and 
with children in a primitive society 
(Henry & Henry, 1944). 

Both boys and girls have served as 
subjects. The only indication that 
there might be sex differences in
willingness to play with dolls comes from
Finch (1954), who reported that 
among her subjects, aged 3-8, those 
from all-boy families refused to par- 
ticipate. However, since her proce- 
dure involved doll play in the home 
as well as in the laboratory, the 
findings may not be typical. 


Equipment 

There is no standard material for 
the construction of dolls—they may 
be made of plastic, wood, clay, 
celluloid, rubber, stuffed fabric, or 
pipe cleaners and cardboard. Their
clothing may be nonexistent, simple 
or elaborate, removable or permanent. 
However, in the majority of studies 
reported, the dolls were 1.5"—6" tall, 
realistically dressed, and were flexible 
so that they could be bent to standing 
or sitting positions. The "standard
doll play family," to the extent that it
exists, consists of father, mother, boy, 
girl, and baby. This number can 
either be reduced or expanded to 
study particular interactions—e.g., 
restricted to mother and child (Isch, 
1952); to mother, baby, and older 
brother or sister (Levy, 1933); or
expanded to include maid (Bryan, 1940);
teacher (Bach, 1945; Melville, 1959);
grandparents (Halnan, 1950; Johnson,
1952); or additional siblings (Bryan,
1940). Sometimes the subject is given 
dolls which duplicate his own family 
(Bremer, 1947; Halnan, 1950; Hol- 
way, 1949; Johnson, 1952; Radke, 
1946; Ryder, 1954), or he is presented 
with a large number of dolls of differ- 
ent age-sex categories, and given his 
choice (Goodman, 1952; Henry & 
Henry, 1944; Korner, 1949). 

The dolls are typically presented
in, or in front of, some indoor setting.
Most common is the use of a five- or
six-room house which has fixed
wooden or cardboard walls, but no 
roof. The house is usually filled with 
realistic, movable doll furniture which
has few manipulatable parts.
Sometimes no house is used; instead, the
child is given furniture which is 
either organized into “rooms” or 
lined up in rows (Bryan, 1940; Finch, 
1954; Goodman, 1952; Halnan, 1950; 
Holway, 1949; Johnson, 1952; Korner,
1949; Phillips, 1945; Pintler, 1945; 
Radke, 1946; Robinson, 1946; Ryder, 
1954). Occasionally blocks are avail- 
able, making it possible for the child 
to construct walls if he desires them 
(Bryan, 1940; Pintler, 1945). Settings 
other than houses have been em- 
ployed in rare instances—e.g., a com- 
plete neighborhood (Meister, 1948), 
a school room (Bach, 1945; Melville, 
1959), or a scale model of a backyard 
filled with play equipment (Bremer, 
1947). 


Procedure 


Typically, the subject is brought 
into an experimental room, shown the 
dolls and other equipment, and told 
that he may play with them in any 
way he wishes. Sometimes it is sug- 
gested that he make up a story (Bach, 
1945, 1946; Bach & Bremer, 1947; 
Hollenberg, 1949; Johnson, 1951; 
Krall, 1953; Levin, 1955) but even in 
these cases the direction of the fan- 
tasy is left completely to the child. 

The interaction between experi- 
menter and subject is usually con- 
trolled to some extent—the experi- 
menter may avoid interaction when- 
ever possible (Bryan, 1940; Honzik, 
1951); he may limit the frequency of 
interaction, usually according to the 
levels established by Pintler (1945), 
which will be discussed later; or he 
may control the situation only in the 
sense of adopting a constant attitude 
of noninterfering permissiveness and 
attentiveness (Bach, 1946; Bach & 
Bremer, 1947; Holway, 1949; Levin, 
1955; Ryder, 1954). 

In studies whose primary aim is to 
compare the results of free doll play 
with results of other measures of
personality (Ryder, 1954; Simpkins, 
1948; Witkin et al., 1954) one session 
of play may be all that is used. Most 
experiments, however, provide for 
the analysis of session-to-session
changes, usually with two 20-minute 
sessions a few days apart. Some 
studies have used more than two ses- 
sions (Bremer, 1947; Heinicke, 1956; 
Hollenberg & Sperry, 1951; Isch, 
1952; Johnson, 1951; Phillips, 1945; 
Pintler, 1945). 

In most cases, the session length is 
determined beforehand, but even 
though measures are taken only dur- 
ing the standard time, the subject 
may be allowed to continue playing 
as long as he wants (Bremer, 1947). 
Some workers have not limited the 
session time, but have recorded all 
responses until the child lost interest 
(Goodman, 1946; Ryder, 1954). The 
latter procedure, of course, makes 
imperative the use of response propor- 
tions instead of response frequencies 
as measures. 
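
The need for proportions when session length is not fixed can be seen in a small sketch. The coded responses below are invented, and the category labels and the function name response_proportions are hypothetical.

```python
from collections import Counter

# Invented coded responses for two children whose sessions ran until
# interest was lost; category labels and counts are illustrative only.
short_session = ["aggression"] * 6 + ["stereotyped"] * 10 + ["other"] * 4
long_session = ["aggression"] * 12 + ["stereotyped"] * 40 + ["other"] * 28


def response_proportions(coded_responses):
    """Share of all recorded responses that fall in each category."""
    counts = Counter(coded_responses)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


short_props = response_proportions(short_session)
long_props = response_proportions(long_session)

print("aggressive responses (frequency):",
      short_session.count("aggression"), "vs", long_session.count("aggression"))
print("aggressive responses (proportion):",
      short_props["aggression"], "vs", long_props["aggression"])
```

By raw frequency the child who played longer appears twice as aggressive. By proportion the ordering is reversed, since his aggressive responses make up a smaller share of a much longer record.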

In studying specific variables which 
would be unlikely to occur with suffi- 
cient frequency to give useful results 
under the free play procedure out- 
lined above, investigators have used 
a more directive approach in which 
the setting of the story is specified. 
The measurements may be mainly in 
terms of the dolls' actions, as when 
Levy (1933) records what the subject 
has the older doll do when it sees its 
baby sibling at the mother's breast, 
or doll play may frankly be used as 
an aid to make it easier to talk to 
children. In the doll play interview 
used by Ammons and Ammons 
(1952), the movement of the dolls is 
often only an adjunct to enable chil- 
dren to express feelings when they 
are having difficulty in verbal
expression. The same seems to be true of
Conn's (1938) study of carsickness,
and of Levy's (1940) and Conn's
(1940) studies of reactions to the
discovery of genital differences.

Studies of prejudice (Goodman, 
1946; Radke & Trager, 1950) have 
confronted subjects with direct 
choices between white and Negro dolls 
to reveal their concepts of the status 
of the racial groups and their prefer- 
ences for them. In addition, Good- 
man used a story completion tech- 
nique, in which the subject decided 
which doll won in cases of conflict. 

The story completion technique is 
not necessarily restricted to the study 
of one variable. Since the completion 
of a prestructured story takes only a 
short time, a variety of situations can 
be presented, offering the advantage 
of overall scores as well as specific 
ones. Stamp (1954) and Walsh (1956) 
had their subjects complete a number 
of stories, including one free story 
which the child made up himself. 
Several other studies have used a 
combination of free play and story 
completion (Halnan, 1950; Johnson, 
1952; Winstel, 1951; Wurtz, 1957). 

D. B. Lynn (1955) has developed a 
Structured Doll Play Test (SDPT) 
which presents the child with 10 situa- 
tions in a given order, each with a 
prescribed arrangement of dolls and 
furniture. The child completes the 
story, which in some situations in- 
volves a clear-cut choice—e.g., be- 
tween bottle and cup, crib and bed, 
mother and father—thus facilitating 
objective scoring. The SDPT has 
already been used in investigating 
age and sex differences (R. Lynn, 
1955) and the effects of father ab- 
sence in Norwegian sailor families 
(Lynn & Sawrey, 1959), and an exten- 
sive program of research using the 
test is planned (Lynn & Lynn, 1959). 

Certainly the effort to get a more 
standardized procedure to insure 
comparability among studies is worth- 
while. At present, the great variety of 
materials and procedures which have 
been employed makes such
comparisons of unknown significance. One
way to overcome this difficulty is to 
follow the line suggested above—i.e., 
to develop a standard procedure and 
use it throughout an extended re- 
search program. However, it is 
readily apparent that no one method 
can fit the needs of every research. 
For example, Wurtz (1957), after 
trying to use responses to incomplete 
stories as an index of guilt, concluded 
that the technique was too highly 
structured for his purposes. It seems 
that in addition to standardization 
an attempt should be made to clarify 
the effects of variations in equipment 
and procedure in order to help explain 
results already obtained, and to offer 
the prospective research worker infor- 
mation that will allow him to select 
the best conditions for his purposes. 

To a large extent, the worker cur- 
rently faced with a choice of doll play 
procedures is offered very little ad- 
vice. The best way to learn how to do 
a good doll play study still seems to 
be to collect the “lore” from someone 
with experience in the area. Some of 
this information is given as hints in 
research reports, but it is scattered 
and because of its dependence on the 
specific conditions may not be gen- 
erally useful. For example, Ammons 
(1950) found that there were signifi- 
cantly fewer refusals to respond in a 
doll play interview when simple 
alternatives were given, when the 
subject was asked what the doll 
would do rather than what it would 
say, when the items were affect- 
loaded, and when the subject was 
asked to verbalize the feelings of 
child, rather than adult, dolls. These 
kinds of “hints” will be useful to any- 
one planning to use a doll play inter- 
view with a sample similar to Am- 
mons’ (boys, aged 2-6), but we 
cannot say whether they have
application to free doll play or to other
age-sex groups. Similarly, it would
be valuable to be able to predict how 
much the child will identify the dolls 
with his own family members, but no 
attempt has been made to find ways 
of influencing this variable. Bach 
(1945) reports that among his nursery 
school subjects, any insistence by the 
experimenter that the child identify 
with the dolls led to resistance by the 
subject. Within the same approxi- 
mate age range, Despert (1940) found 
14 out of 15 subjects who made at 
least some specific identification of 
the dolls with their own families, 
while Finch (1955) reports little suc- 
cess in getting children to act out 
parental roles in relation to dolls in 
the laboratory. 

The major attempts to evaluate 
the effects of equipment and proce- 
dure have been made in research un- 
der the influence of R. R. Sears. 
Phillips (1945) found that the only 
effects of giving the subject highly 
realistic dolls and furniture rather 
than having him play with unclothed 
dolls and "furniture" of simple
wooden blocks were increased ex- 
ploratory behavior and less time 
spent in organizing the materials. 
Pintler's (1945) study of the effect of 
organization of the equipment dis- 
closed that when the furniture and 
walls of the house were arranged in 
irregular rows instead of being or- 
ganized into rooms, children spent 
more time in organizational behavior. 
Giving the subject a doll family that 
duplicates his own has been shown to 
produce more identification with the 
dolls than does the use of a standard 
family (Robinson, 1946). In a study 
comparing yard and house settings, 
Bremer (1947) found that the use of a 
house led to more inappropriate or- 
ganizational behavior, whereas hav- 
ing the dolls placed in a yard setting 
with picnic, garage, sandbox, slide, 
and swing produced more
nonstereotyped thematic fantasies, more theme
changes, and more total aggression. 

In the investigation of the effective- 
ness of the experimenter in maintain- 
ing rapport and stimulating the child 
to elaborate themes in play within the 
experimental situation (Pintler, 1945) 
it was found that high interaction 
between experimenter and child (be- 
tween 15 and 20 of such stimulating 
interacts in 5 minutes of play) pro- 
duced more nonstereotyped fantasies, 
more theme changes, more aggression, 
and an earlier onset of aggression 
play than did a low interaction level 
(fewer than 5 interacts in 5 minutes).
Studying the effect of the length and 
number of experimental sessions, 
Phillips (1945) found no differences 
between the fantasy material pro- 
duced in three 20-minute sessions and 
that in a single hour-long session. 
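
The two interaction levels described above can be written as a simple rule. This is a sketch only: the function name interaction_level is hypothetical, and averaging the count over the whole session (rather than checking each 5-minute block) is our assumption, not something stated in the source.

```python
def interaction_level(interacts, minutes):
    """Label a session by experimenter interacts per 5 minutes of play.

    Thresholds follow the figures quoted in the text: 15 to 20 per
    5 minutes is "high" and fewer than 5 is "low". Anything in between
    or above is left unclassified, because the text gives no label for it.
    """
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    per_five_minutes = interacts * 5 / minutes
    if 15 <= per_five_minutes <= 20:
        return "high"
    if per_five_minutes < 5:
        return "low"
    return "unclassified"


# Three invented 20-minute sessions.
for count in (70, 12, 40):
    print(count, "interacts in 20 minutes ->", interaction_level(count, 20))
```

The unclassified band is worth noting. A rate between the two levels has no place in the original scheme, which is where an intermediate level would fall.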

The sex of the experimenter also 
seems to affect results. Subjects show 
more aggression in the presence of an 
experimenter of the same sex (Caron 
& Gewirtz, 1951). 

The above summary represents 
our total substantive information 
about the effects of procedural varia- 
tions. Even our information about 
those variables that have been in- 
vestigated is limited, since most of 
them have been studied in isolation 
or in combination with only one other 
manipulated variable. Each is tied to 
the particular age group on which it 
was used, and has been tested only 
with respect to a limited number of 
dependent variables. For example, 
perhaps the most widely used refer- 
ence of those in the above discussion 
is the work of Pintler (1945). Several 
studies (Bremer, 1947; Jeffre, 1946; 
Krall, 1953; Phillips, 1945; Robinson, 
1946; Scott, 1954; Sears, 1951; Sears, 
Pintler, & Sears, 1946; Yarrow, 1948) 
used "Pintler's high interaction level"
or "Pintler's low interaction level" as
standards which have been
demonstrated to have certain effects. There
is no doubt that this work is a valu- 
able contribution; however, it ap- 
pears that there is much knowledge 
still to be gained on interaction level 
before complete understanding of its 
role is reached. Pintler's study used 
only preschool children, and there are 
indications that the levels she used 
may be less successful with older 
children. For example, Simpkins 
(1948), using 5-9 year old subjects, 
tried to use Pintler's high interaction 
level, but decided that the fantasy 
material was being directed by the 
experimenter too much, so she em- 
ployed a more nondirective attitude. 
E. Z. Johnson (1951) found that 
neither of Pintler's levels was satis- 
factory for third graders, and ended 
up using an intermediate level. 

The effects of many potentially 
influential variables have never been 
studied. Thus far, all the studies that 
have varied the behavior of the ex- 
perimenter have revealed differences 
as a consequence. Since it appears 
that the experimenter is necessary to 
encourage verbalization of the sub- 
jects, his role becomes crucial. Is 
there a way to standardize the "doll
play experimenter"? Or is it more
profitable to partial out his influence 
on the results? It seems to us that the 
answer to these questions awaits the 
demonstration of significant differ- 
ences—e.g., attributable to the 
“warmth” of the adult, a dimension 
suggested by workers in the field to 
be of importance. Only after such 
characteristics have been identified 
objectively can a decision be made as 
to how best to control them.


RELIABILITY AND VALIDITY 
Consistency of Behavior 


In this section it is not our inten- 
tion to discuss scoring reliability, 
since this form of reliability, as in all 
observational procedures, depends on 
the explicitness of the definitions of
the observation categories and on the 
adequacy of observers' or coders' 
training. Nevertheless, it surprised 
the reviewers how often researchers 
neglected the common research pro- 
tocol of reporting observer reliability. 
In many cases, this simple omission 
makes the appraisal of results diffi- 
cult. 

Two kinds of information exist 
about the consistency of a subject's 
behavior in doll play: the comparison 
of scores across two or more sessions 
(analogous to test-retest reliability) 
and comparisons between early and 
later portions of a single session (as in 
split-half reliability). As with so 
much of doll play data, most informa- 
tion exists for the aggression variable. 
The session-to-session correlations in
either amount or percent aggression
vary from .50 to .85, with a median
intersession correlation of about .65
(Ammons & Ammons, 1949; Gewirtz, 
1950; Levin & Sears, 1956; Sears, 
1951; Stamp, 1954; Yarrow, 1948). 
These correlations, when interpreted
against a background of varying
observer reliabilities and session-to-session
changes in the incidence of
aggression (see below), indicate quite
acceptable reliability. It should be 
kept in mind that test-retest relia- 
bilities of more highly standardized 
tests of intelligence are within the 
same range for children of this age. 

Ammons and Ammons (1949), in a
structured doll play situation, report
corrected split-half reliabilities for 
aggression of .77 for the first session 
and .75 for the second. 
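
The text does not say which correction was applied to these split-half values. The sketch below assumes the conventional Spearman-Brown step-up and works backward to the half-session correlations that the reported coefficients would imply under that assumption. The function names are illustrative only.

```python
def spearman_brown(r_half: float) -> float:
    """Step a half-session correlation up to a full-session reliability estimate."""
    return 2 * r_half / (1 + r_half)


def half_correlation(r_corrected: float) -> float:
    """Invert the step-up: the half-session correlation implied by a corrected value."""
    return r_corrected / (2 - r_corrected)


# Corrected values reported for the two sessions; Spearman-Brown is an assumption here.
for session, corrected in (("first", 0.77), ("second", 0.75)):
    r_half = half_correlation(corrected)
    print(f"{session} session: corrected {corrected:.2f} implies "
          f"a half-session r of about {r_half:.2f}")
```

If the assumption holds, the uncorrected correlations between early and later portions of a session fall close to the median intersession correlation cited above.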

For behavior other than doll play aggression,
Bryan (1940) reports a more holistic
appraisal of behavioral consistency
in doll play, wherein a graduate
student matched protocols for two
sessions at better than chance level.
The intersession period in Radke's
study (1946) was 4-5 weeks. The
consistency in specific categories
ranged from 29% for dominant 
themes to 67% for such variables as 
attitude toward the mother. 

There are, unfortunately, too few 
reports of the consistency of behaviors 
other than aggression to appraise the 
reliability of other variables. 

Validity 

Doll play shares with expressive- 
projective techniques certain serious 
problems in the determination of 
validity. Take aggression as an ex- 
ample. Since aspects of this behavior 
are disapproved in real life and since 
doll play presumably reduces these 
social restraints it may be expected 
that children high in the inhibition of 
real aggression may be especially 
aggressive under make believe cir- 
cumstances. Were the result a sub- 
stantial negative correlation between 
real life and fantasy aggression, the 
purposes of validity would still be well 
served. However, in any group of 
children we may expect variations 
among children in the amount of 
aggression anxiety and so there may 
be no negative or positive relation- 
ships between real and fantasy ag- 
gression. 

The problem is the ubiquitous one 
of whether doll play behavior is 
replicative of real life or wish fulfilling 
in relation to real life. Bach (1945) 
estimates that more than 75% of
children’s doll play responses are replicative,
and the writers’ experiences
tend to support this contention. This
further complicates matters because 
the validity problem would be more 
amenable to solution if a child were 
consistent in one mode or the other 
whereas he probably varies even 
within a single session. These prob- 
lems will be taken up again later. 

Against this pessimistic backdrop, 
we may inspect the validity of doll 
play behavior against the following 


34 HARRY LEVIN AND ELINOR WARDWELL 


criteria: observation of real life be- 
havior, teacher's ratings, other meas- 
uring techniques, and questioning 
the child. 

Observation of real life behavior. 
Several impressionistic accounts do 
not agree. Despert (1940) reports 
that doll play home life had "associated
emotional expressions not in all
cases in accordance with the observations
made on their overt social behavior
(family or group)" (p. 25).
By contrast, Miller and Baruch 
(1950) and Henry and Henry (1944) 
say that various types of aggression 
and sibling rivalry are congruent 
between doll play and real life. 

In a well worked-out study, Isch 
(1952) compared behavior in doll 
play during four sessions with the 
observations of mother-child inter- 
action in two half-hour sessions. The 
correlations tended to be low for
equivalent categories (around r = .20),
but Isch believed that fantasy
tended to reproduce real life. For 
example, when the mothers were 
highly rejecting and highly aggressive 
the children represented the mother 
doll as aggressive. In general, aggres- 
sion was more severe in fantasy than 
in real life, e.g., burning a doll in the 
stove. 

Two other studies are relevant. 
R. R. Sears (1947), relating several 
studies, reports a complicated rela- 
tionship between aggression in nur- 
sery school and in doll play. Children 
who were least aggressive in preschool 
exhibited both extremes in doll play 
aggression, the determining factor 
being how severely the subjects were 
punished at home for aggression. 
Heinicke (1956) says that there is a 
generally good correspondence be- 
tween nursery school behavior and 
actions in doll play by 2-year-olds, a 
younger group than is usually em- 
ployed in doll play research. How- 
ever, it should be remembered that 


2-year-olds do not engage in doll 
play, in the usual sense. 

Teacher's ratings. The relationship 
between teacher's ratings and doll 
play behavior is unclear for several 
reasons. For one, the results them- 
selves are contradictory. For an- 
other, where the teacher is not rating 
actions similar to those manifested in 
doll play but is providing data for 
predicting doll play behavior, the 
findings are usually rationalizable, 
post hoc, by common sense or by one 
theoretical scheme or another. In 
line with the latter point, for example, 
Bach (1945) reports that children 
rated as "compliant" by their teacher
had, in doll play, more fantasies about 
school, more stereotyped fantasies, 
and were less aggressive toward the 
teacher doll. These, and other find- 
ings like them, seem reasonable, but 
before we accept them as evidence of 
formal validity, the specifications for 
the rejection of the hypothesis are 
necessary. Unfortunately, the state 
of theory in personality development 
is not yet able to provide such specifi- 
cations. 

The problem of replication versus 
wish fulfillment particularly troubles 
the interpretation of the relationships 
between aggression in the classroom 
and in the fantasy situation. One 
prediction, based on a theory of dis- 
placement, is that docile children in 
the classroom will be aggressive in the 
fantasy situation, but the prediction 
must further involve the manner in 
which the child's real life aggressions 
are handled. Restrictive classrooms 
appear, for instance, to depress fan- 
tasy aggression (Levin, 1955). 

As with so much of the doll play 
data, the findings of different re- 
searchers do not agree. Bach (1945) 
found that children rated as "normally
aggressive" showed less thematic
aggression than did either of
the extreme groups. Isch (1952), at 
least during the first three of four doll
play sessions, found just the opposite. 
By the fourth session, subjects rated 
as "strongly aggressive" showed the 
most fantasy aggression. Korner 
(1949) found no relationships be- 
tween teacher ratings of hostility and 
the manifestation of hostile actions in 
doll play. Bach (1945) may have a 
resolution to this dilemma when he 
reports that there is a closer corre- 
spondence between rated and fantasy 
behavior for those children who 
"identified" with a doll—called it 
“I,” or protected it, etc. 

Two impressionistic attempts at 
validation disagree so completely that 
they do little more than confuse the 
issue. In Bryan’s study (1940) 
teachers could match complete pro- 
tocols of doll play with the appropri- 
ate children accurately only in one 
out of 20 attempts. By contrast, 
Walsh (1956) reports 90% agreement 
between doll play and teachers’ rat- 
ings on such variables as freedom of 
action, freedom and adequacy of 
emotional expression, and response 
to environmental stimuli. 

Relationship of doll play to other 
measuring techniques. The several 
studies of predictive validity give a 
generally more hopeful account of the 
validity of various doll play measures. 
Ryder (1954) reports that behavior in 
doll play agrees with that in balloon 
play and blocking; Simpkins (1948) 
found that when the Ames picture 
stories and doll play were scored on 
the same categories the agreement 
was high, although there were more 
responses, many of them nonthematic,
in doll play than in the story
situation. Witkin et al. (1954), in a
study different from most using doll
play, found that children who exhibited
much organization in fantasy
play tended to be able to resist field 
influences in perception as ascertained 


by tilting-room-tilting-chair tests and 
by rod and frame tests. 

Radke (1946) strikes the only 
dissident note in this rubric of valid- 
ity, among the authors who treat 
predictive validity. She failed to find 
relationships between doll play and 
projective picture identifications 
which she used as part of a large 
battery of measures on preschool 
children. 

Doll play and direct questioning of 
children. Many of the same factors 
which make doll play data difficult to 
understand also influence the ways in 
which children answer interview ques- 
tions, so that the relationships—or 
lack—between the two must be 
treated cautiously. The agreement in 
responses to the two questions, "Which
parent do you like best?" and "Which
one does the little boy (doll) like
best?" ranges from 25% to 63%.
When the inquiry is phrased as
"Which doll loves other most often?"
the agreement between the answer 
and nonstereotyped doll play goes up 
to 68.4% (Graham, 1955). The 
closer correspondence of the second 
study is reasonably attributed to the 
likelihood that the child was reporting 
about his doll play performance 
itself rather than about the ante- 
cedents of the fantasy. 

In summary, the findings on valid- 
ity are not substantial, if by validity 
we mean the correspondence between 
doll play and nonfantasy behavior. 
On theoretical grounds, strong con- 
gruence should not be expected. 
More definitive tests of validity must 
take the form of construct validity 
which in turn waits on clear and 
unequivocal hypotheses. 


AREAS OF RESEARCH 


One of the qualities of doll play 
which has made it attractive to re- 
searchers is the flexibility with which 
it can be adapted to different content 
areas. Modifications of equipment
and procedure have made it possible 
to study a great variety of human 
problems “in miniature." Among the 
areas of doll play research are the 
following:  constructive-destructive 
tendencies (Ackerman, 1937), father 
fantasies of delinquent children (Bach 
& Bremer, 1947), evaluation of play 
therapy (Bixler, 1942), car sickness 
(Conn, 1938), concepts of parental 
roles (Finch, 1955), sibling rivalry in 
American (Levy, 1936) and in Pilaga 
Indian children (Henry & Henry, 
1944), aggression and aggression anx- 
iety in accident repeaters (Krall, 
1953), hostility in allergic children 
(Miller & Baruch, 1950), adult schizo- 
phrenia (Rosenzweig & Shakow, 
1937), reactions to the discovery of 
genital differences (Conn, 1940; Levy, 
1940), self-concepts of underachievers
(Walsh, 1956), and achievement and 
work fantasies of industrious children 
(Melville, 1959). 

In fact, the number of variables 
that have been educed from doll play 
is so great that we cannot catalog all 
of them. Instead, five problems 
which have been investigated exten- 
sively will be discussed below in an 
attempt to summarize the present 
state of information about them and 
to illustrate some measuring problems 
commonly found in the use of the doll 
play technique. The areas are aggres- 
sion, stereotypy, doll preference, 
father absence, and prejudice. 


Aggression 


Far more than any other behavior, 
aggression has been investigated by 
doll play techniques. The investi- 
gator has assurance that at some 
point in the doll play procedure, a 
substantial number of children will 
evidence some aggressive acts. The 
behavior may be verbal, or acting 
out the aggression, or a combination. 
Conceptually, aggression has been 
defined as any act whose intent is to 


injure, physically or psychologically, 
another doll or equipment. Opera- 
tionally, this common definition pre- 
sents certain difficulties. First is the 
inference of intent. This part of the 
definition is designed to eliminate 
accidental aggressive acts. Since the 
child is manipulating dolls and furni- 
ture in a small space, he will from 
time to time knock over a doll or a 
piece of equipment without appar- 
ently meaning to. Investigators 
often want to ignore such fortuitous 
acts, and, in fact, it is not difficult to 
distinguish such accidental acts from 
“intended” aggression. It seems to 
us that the problem in operation is 
far less serious than is the inclusion of 
intent in the formal definition. 

Another category of events not 
covered by the definition but often 
scored as aggression is the attribution 
of motives or traits by the subject to 
a doll; e.g., the boy is bad, the 
mommy is mean. One way of han- 
dling this contingency is to include the 
subject as a scorable agent of aggres- 
sion and to count the above two 
examples as aggression from the sub- 
ject to the appropriate doll. 

In fact, a major virtue of doll play 
is the freedom it provides the investi- 
gator to design a scoring system that 
fits his problems. The many specific 
categories which have been scored 
under the general aggression rubric 
are illustrated in the middle column 
of Table 1. They include total aggression,
verbal and physical aggression
(often interpreted as indirect
and direct), mischief, scolding, tan- 
gential, displaced, projected, etc. 
The latency of the first aggressive act 
in the session has been studied, and 
usually interpreted as an index of 
aggression anxiety. The agents and 
objects of aggressive acts are popular 
topics of study. A generalization, 
though, is that when many sub- 
categories of aggression are scored, 
the incidence in any one category is 
TABLE 1
SUMMARY OF STUDIES ON AGGRESSION

Ammons & Ammons, 1953
  Measures of aggression:
    Reactions to aggression
      Counter-aggression
      Leaving field
      Verbal expression
      Appeal to adult
      Inhibition of aggressive feelings
    Outcome of aggression
      Success
      Failure
    Objects of aggression
      Negro doll
      White doll
  Independent variables:
    Age of subject

Bach, 1945
  Measures of aggression:
    *Hostility-aggression (% of nonstereotyped)
    Agent of aggression
    *Object of aggression
  Independent variables:
    *Sex of subject
    *Teacher ratings on aggression and compliance in school
    *Frustration before doll play

Bach, 1946
  Measures of aggression:
    *Total aggression (% of total acts) involving father doll
    *Agent of aggression
    Object of aggression
  Independent variables:
    *Father separation
    *Sex of subject
    Mothers' reports of their descriptions of absent fathers to children

Bach & Bremer, 1947
  Measures of aggression:
    *Fantasy aggression (frequency)
    Killing
    Justification of hostile aggression
    Defensive rationalization
    Aggression in response to commands
    *Father doll as agent or object
  Independent variables:
    *Delinquency of subjects (home for prepsychopathic children)

Bremer, 1947
  Measures of aggression:
    *Total aggression (frequency)
    Nonstereotyped aggression
    Stereotyped aggression
    Suffered aggression (expressing suffering in pain)
    Chasing or escaping
    Justification of aggression
    Nonthematic aggression
    *Agents of aggression
    Objects of aggression
  Independent variables:
    *Doll house setting compared to yard setting
    *Sex of subject

Caron & Gewirtz, 1951
  Measures of aggression:
    *Total aggression (% of total acts)
    *Latency of aggression
    Projection of aggression
    Agent of aggression
    Object of aggression
  Independent variables:
    *Sex of experimenter
    *Sex of subject
    *Age of subject

Gewirtz, 1950
  Measures of aggression:
    *Total aggression (% of total acts)
    Direct
  Independent variables:
    *Sac and Fox Indians compared to white children
    Age of subject

* Involved in relationships significant at .05 or better.
TABLE 1—Continued

Gewirtz, 1950 (continued)
  Measures of aggression:
    Physical injury
    Indirect
    Verbal aggression
    Discipline
    Discomfort-causing
    Agent of aggression
    Object of aggression
    Projection of aggression
    Displacement of aggression
    Response attenuation (ratio of direct to indirect)
  Independent variables:
    *Sex of subject
    *Session-to-session changes

Gewirtz & Caron, 1954
  Measures of aggression:
    Physical injury (% of total acts)
    Latency of aggression
    Agent of aggression
    Object of aggression
  Independent variables:
    Sex of experimenter
    *Sex of child
    Session-to-session changes

Hollenberg & Sperry, 1951
Hollenberg, 1949
Sperry, 1949
  Measures of aggression:
    *Total aggression (% of total acts)
    Aggressive mischief, disobedience
    Verbal discipline & aggression feelings (scold, threaten, derogation)
    Physical discipline
    Physical injury to person
    Physical injury to equipment
    States of uncomfortable feeling
    *Projection of aggression
    Intensity of aggression
  Independent variables:
    Frustration at home (mother interview)
    *Punishment for aggression at home (mother interview)
    *Sex of child
    *Session-to-session change
    *Disapproval of aggression during experimental session

Holway, 1949
  Measures of aggression:
    Total aggression (frequency)
  Independent variables:
    Mother interview data on
      (a) Strictness of feeding schedule
      (b) Mother's feeling tone on feeding schedule
      (c) Number of months breast fed
      (d) Age begin toilet training

Jeffre, 1946
Isch, 1952
  Measures of aggression:
    *Total aggression (frequency)
    *Total aggression (% of total acts)
    Agent of aggression
    *Object of aggression
  Independent variables:
    *Teacher ratings on aggression
    *Session-to-session changes
    Observed mother-child interaction (rejection, aggression)

Johnson, 1951
  Measures of aggression:
    *Total aggression (frequency)
    *Aggression mischief
    *Verbal aggression
    *Physical injury to person or equipment
    *States of uncomfortable feeling
    Verbal discipline
    Physical discipline
    *Agent of aggression
    *Object of aggression
    *Contrasocial
    Prosocial
  Independent variables:
    *Age of subject
    *Sex of subject
    *Session-to-session changes
TABLE 1—Continued

[Korner, 1949]
  Measures of aggression:
    Total aggression (frequency)
    (Listing of types of aggression observed)
  Independent variables:
    Content of incomplete story
    Hostility ratings based on parent & teacher ratings

[Krall, 1953]
  Measures of aggression:
    *Total aggression (frequency)
    Action aggression
    *Verbal aggression
    Aggressive anxiety
    Differences between verbal aggression & action aggression
    *Latency of aggression
    Inhibition of aggression
    Displaced aggression
    Projected aggression
  Independent variables:
    *Accident prone compared to accident free children
    *Sex of subject
    Session-to-session changes

Levin, 1953
  Measures of aggression:
    *Total aggression (% of total acts)
  Independent variables:
    *Sex of subject
    *Severity of punishment of aggression at home (mother interview)
    *Session-to-session changes
    *Quarreling & fighting in class (teacher judgment)

Levin, 1955
  Measures of aggression:
    *Total aggression (% of total acts)
  Independent variables:
    *Sex of subject
    *Dominance-control of classroom teacher (observed)
    Session-to-session changes

Levin & Sears, 1956
  Measures of aggression:
    *Total aggression (% of total acts)
  Independent variables:
    *Sex of subject
    *Identification with parent (mother interview)
    *Sex of usual punisher (mother interview)
    Severity of punishment for aggression toward parents (mother interview)
    *Session-to-session changes
    *Ordinal differences
    Socioeconomic status

Levin & Turgeon, 1957
  Measures of aggression:
    *Total aggression (% of total acts)
  Independent variables:
    *Mother's presence at doll play session
    *Stranger's presence at doll play session
    *Sex of subject

Levy, 1936
  Measures of aggression:
    Prevention of hostility
    Direction of hostility (order of attack on different objects)
    Forms of hostility
      Mild
      Simple assault
      Primitive hostility
    Self-punishment & retribution
  Independent variables:
    Sibling rivalry problems of subjects
Miller & Baruch, 1950
  Measures of aggression:
    Presence-absence of
      *Direct hostility
      *Indirect hostility
      Displaced hostility
      *Hostility against self
  Independent variables:
    Allergic compared to nonallergic problem children

Phillips, 1945
  Measures of aggression:
    *Total aggression (frequency)
  Independent variables:
    Realism of materials
    Session length
    *Session-to-session changes

Pintler, 1945
  Measures of aggression:
    *Total aggression (frequency)
    Latency of aggression
  Independent variables:
    *Experimenter-subject interaction
    *Organization of materials
    *Session-to-session changes

Pintler, Phillips, & Sears, 1946
  Measures of aggression:
    *Total aggression (frequency)
  Independent variables:
    *Sex of subject

Robinson, 1946
  Measures of aggression:
    Total aggression (frequency)
    Stereotyped
    Nonstereotyped
    Agent of aggression
    *Object of aggression
  Independent variables:
    *Type of doll family: standard or duplicate of subject's family
    Presence or absence of sibling in subject's family

Ryder, 1954
  Measures of aggression:
    *Rating of aggressive feeling
    Rating on inhibition of aggressive feeling
    Total aggression (frequency)
  Independent variables:
    *Father separation
    *Sex of subject

Scott, 1954
  Measures of aggression:
    Total aggression (frequency)
    *Agent of aggression
    *Object of aggression
  Independent variables:
    *Separation from parents

Sears, 1951
  Measures of aggression:
    *Total aggression (frequency)
    Nonthematic
    Thematic
    *Bodily injury
    *No bodily injury
    Nonpersonal aggression (by dolls toward nonpersonal objects)
    Trouble as result of demons, catastrophes, or imaginary characters
    *Latency of aggression
    Agent of aggression
    *Object of aggression
  Independent variables:
    *Sex of subject
    *Age of subject
    *Sibling status
    *Father absence
    *Session-to-session changes

Sears & Pintler, 1947
  Measures of aggression:
    Agent of aggression
    *Object of aggression
    Content of aggression
  Independent variables:
    *Sex of subject

Sears, Pintler, & Sears, 1946
  Measures of aggression:
    *Total aggression (frequency)
    Agent of aggression
    *Object of aggression
  Independent variables:
    *Sex of subject
    *Age of subject
    *Father separation
    *Session-to-session changes
TABLE 1—Continued

Stamp, 1954 (story completions)
  Measures of aggression:
    *Direct (% total aggression: self doll—parent)
    *Indirect (% total aggression: self doll, with implied intention; or parent but not by self)
    Directed—self (% total aggression)
    *Displaced (remaining % total aggression)
  Independent variables:
    *Teacher ratings of subjects as “rebellious” or “submissive”
    *Sex of subject
    Session-to-session changes

Yarrow, 1948
  Measures of aggression:
    Agent of aggression
    Object of aggression
    *Total aggressive acts
    *Nonstereotyped aggression
    *Stereotyped
    *Tangential aggression
    *Latency of aggression
  Independent variables:
    *Sex of subject
    Experimentally induced frustration, antecedent to doll play
      (a) Failure
      (b) Satiation
    *Session-to-session changes

so small that they are combined for 
purposes of analysis into large group- 
ings such as “total aggression” or 
"direct" and "indirect" aggression, 
etc. 

The tendency to proliferate basic 
categories and to recombine them 
into various indices presents a diffi- 
cult problem for comparing and 
evaluating studies. Since a large 
number of combinatorial indices are 
possible from a few basic variables 
and since experimenters choose for 
theoretical or other reasons to form 
different combinations, studies which 
should be comparable are not. The 
evaluator is also tempted to think 
that many of the combinations and 
arithmetic manipulations of scores 
were reached post hoc and to wish for 
replications of findings. 
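
The size of this problem can be put in rough numbers. The sketch below takes seven of the subcategories named earlier in this section as the basic variables and counts only indices formed by adding two or more of them together. Restricting the count to simple sums is an assumption made for illustration. Ratio indices, such as a direct-to-indirect ratio, would enlarge the total further.

```python
from itertools import combinations

# Seven subcategories named in the text; treating them as the full basic set is an assumption.
BASIC = ["verbal", "physical", "mischief", "scolding",
         "tangential", "displaced", "projected"]


def composite_indices(categories):
    """Return every index formed by summing two or more basic categories."""
    return [combo
            for size in range(2, len(categories) + 1)
            for combo in combinations(categories, size)]


indices = composite_indices(BASIC)
print(f"{len(BASIC)} basic categories allow {len(indices)} distinct summed indices")
```

Two investigators who each pick one index from a pool of this size will seldom pick the same one unless they coordinate in advance.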

As illustrative of the large amount 
of research on aggression, only a few 
topics will be discussed in detail: sex 
and age influences; session-to-session 
changes; and the child rearing antecedents
of total, displaced, and projected
aggression.

Age, sex, and aggression. The single
best documented finding using the doll
play technique is that boys are more


aggressive than girls. Still, in spite 
of the overwhelming evidence on this 
point there are a few contradictory or 
nonconfirmatory results. Krall (1951)
reported more aggression among the 
girls in her sample, but a careful 
check on her data is best interpreted 
as no rather than reversed sex differ- 
ences. Likewise, Henry and Henry 
(1944) reported no sex differences in 
aggression for Pilaga Indian children, 
and Hollenberg and Sperry (1951) 
found none among Iowa City nursery 
school subjects. Since the findings 
are so overwhelmingly in the other 
direction the burden of explaining the 
dissenting results must fall on these 
few investigators. 

E. Z. Johnson (1951) adds an important
result to the repetitive "boys
more than girls" data. She found
that boys do exceed girls in physical 
aggression, but that girls show more 
verbal aggression than do boys. This 
finding is reasonable in light of the 
findings on overt—nonfantasy—ag- 
gression. 

Johnson’s finding in regard to age 
of the subjects is also provocative. 
Younger children show more of what 
she calls "contrasocial" aggression, 
while older children's aggression is
more "prosocial," usually depictions
of the parents punishing the children. 
As one compares the 5- with the 8- 
year-olds, the usual sex difference in 
aggression decreases. Caron and 
Gewirtz (1951) confirm this finding. 
P. S. Sears (1951), on the other hand, 
found that the sexes become more 
different in this respect as they are 
older, but it must be remembered that 
her subjects ranged in age from 3 to 5, 
which is younger than the youngest 
group in the other two studies citing 
age differences in fantasy aggression. 

Session-to-session changes in doll
play aggression. Second only to the 
consistent finding that boys are more 
aggressive than girls, is the ubiqui- 
tous result that children are more 
aggressive in the second than in
the first session of doll play (e.g.,
Hollenberg & Sperry, 1951; Levin & 
Sears, 1956; Sears, 1951). Although 
the amount of aggression increases, 
children tend to maintain their rela- 
tive rank order in aggressiveness 
(Sears, 1951). The above findings 
apply to the first two sessions. When 
children participated in more than 
two sessions, aggression in the later 
sessions presented a more compli- 
cated picture. For example, Jeffre 
(1946) reported that across four 
sessions, aggression toward the ex- 
perimenter and equipment increased, 
which seems a likely reflection of a 
child’s frustrated boredom with the 
doll play task. Pintler (1945), using 
three sessions, found that the latency 
of the first aggressive act decreased 
in the later sessions. 

The increase in aggression appears 
to be related to amount of experience 
in doll play and not particularly to 
the interval between sessions. Phillips 
(1945) compared doll play perform- 
ance in a single one-hour session 
to three 20-minute sessions. The 
changes that occurred between the 
first and final thirds of the massed 


session were similar to those between 
the first and last distributed sessions. 

A reasonable explanation of this 
common finding is that the child 
learns with time that the restraints 
against the expression of aggression 
are not operative in doll play and 
hence he may vent his impulses more 
freely. The fact that when a stranger 
is introduced into a second session 
the usual increase in aggression does 
not occur lends experimental cre- 
dence to this interpretation (Levin & 
Turgeon, 1957). 

Child rearing antecedents of doll 
play aggression. The hypotheses re- 
lating certain child rearing practices 
to aggression in doll play have come 
from psychoanalytic and behavior 
theory. The setting of doll play is 
thought of as a situation relatively 
free from real life restraints and so 
appears on a similarity dimension 
with home and school, but different 
enough from the real life settings so 
that the restraints against aggressive 
expression are less potent. If, there- 
fore, aggression is punished at home, 
such actions are less likely to occur at 
the point of punishment but will be 
manifest in the safety of doll play. 
The general hypothesis has been that 
there is positive correlation between 
severity of punishment at home and 
the incidence of aggression in doll 
play. One shortcoming of the dis- 
placement hypothesis is that it does 
not predict a higher frequency of 
incidence in doll play than the less 
severely punished condition—only 
that such denied behaviors will ap- 
pear in fantasy but not in real life. A 
conflict drive hypothesis has been 
added to the original displacement 
one to cover this lack (Whiting & 
Child, 1953, p. 353). This additional 
hypothesis postulates a drive incre- 
ment due to the subject’s desire to 
express aggression and his fear of 
such expression. Since drive operates 
multiplicatively, the combination of 
hypotheses covers the prediction of
severe parental punishment leading 
to high fantasy aggression. 

What is the evidence for this hy- 
pothesis? Hollenberg and Sperry 
(1951) reported confirmation. In an 
earlier summary report of research 
performed under his direction, R. R. 
Sears (1947) found the predicted 
state of affairs: those children who 
were most severely punished at home 
were most aggressive in doll play. 

An attempt to replicate this finding 
further entailed fairly elaborate 
changes in the hypothesis (Levin & 
Sears, 1956). On a larger and more 
varied sample—the previous studies 
were done with university nursery 
school groups—the simple "punishment
leading to aggression" hypothesis
did not hold. Rather, doll play
aggression was shown to be predict- 
able from a combination of the sex of 
child, the real life agent of punish- 
ment, and the nature of the child’s 
identification with his parent, as well 
as the severity of punishment. In 
general, these findings lend them- 
selves more easily to a replicative 
rather than to a displacement inter- 
pretation. Taken in the light of E. Z. 
Johnson’s (1951) finding that older 
children evidenced more prosocial 
aggression, the more mature, identi- 
fied children may be portraying the 
parental punishment that they have 
experienced. It is interesting to note 
that real life aggression among pri- 
mary school children was predictable 
from much the same variables as the 
fantasy behavior in the Levin and 
Sears’ study (Eron, 1958). 

One other study based on the displacement
hypothesis obtained completely
contradictory results (Levin
& Turgeon, 1957). The prediction 
that the presence of the mother at 
the doll play session would redinte- 
grate aspects of the home and reduce 
the freedom of doll play was not 
borne out. The opposite finding 


emerged; aggression was more frequent
in the mother's presence than in
the control condition. The investigators
called the original hypothesis
into question and suggested that 
there are characteristics of the doll 
play situation which make doubtful 
its use as a point on a simple freedom- 
from-inhibition dimension. 

Wurtz (1960) in a recent theoretical 
statement argued that mild aggres- 
sion anxiety should facilitate attenu- 
ated aggression in doll play. He 
found some confirmation for this 
notion in a reanalysis of earlier data 
reported by P. S. Sears (1951) and 
Sears, Pintler, and Sears (1946), 
when the index of attenuation is 
based on the use of child compared 
to adult dolls as agents and objects 
of aggression. 

In addition to thinking of total 
doll play aggression as a manifesta- 
tion of displacement, the same phe- 
nomenon has been studied within the 
doll play situation itself. If a child 
has been punished for aggression, the 
depiction of this punishment in doll 
play should arouse more anxiety than 
in cases of aggression toward a doll 
less similar to the performer. This 
conceptualization creates substantial 
difficulties. It implies that although 
doll play in general is not very in- 
hibiting, there is still sufficient anxi- 
ety to influence the choice of dolls 
that act as the objects of hostility. 
We might expect, therefore, a mild 
and not very consistent effect on the 
choice of objects of aggression. The 
unreliability should be compounded 
by the low incidence of acts which 
determine any displacement score. 
For example, if 15% of all doll play 
units are aggressive and this percent- 
age is divided among five dolls 
equally, we are dealing with expected 
displacement scores of 3% of the 
total number of acts, and the unreliability
of this minuscule proportion is
obvious. 
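
The arithmetic behind this example can be made concrete. The sketch below is illustrative only: the session length of 100 units is an assumed figure, not one taken from any study reviewed here, and the binomial standard error (which treats units as independent) is simply one convenient yardstick for unreliability.

```python
import math

def expected_displacement_share(aggressive_share, n_dolls):
    """Share of all doll play units expected per doll if aggression is spread equally."""
    return aggressive_share / n_dolls

def binomial_standard_error(p, n_units):
    """Standard error of an observed proportion p, assuming n_units independent units."""
    return math.sqrt(p * (1.0 - p) / n_units)

share = expected_displacement_share(0.15, 5)  # 15% aggressive units over five dolls
n_units = 100                                 # hypothetical session length (assumption)
expected_acts = share * n_units
se = binomial_standard_error(share, n_units)

print(f"expected share per doll: {share:.2%}")
print(f"expected acts per doll in {n_units} units: {expected_acts:.1f}")
print(f"standard error of that share: {se:.2%}")
```

Under these assumptions a doll is expected to receive about three aggressive acts per session, and the standard error (roughly 1.7 percentage points) is more than half the size of the 3% being estimated.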

The implications of the displacement
hypothesis for understanding
pro- and contrasocial aggression are 
especially difficult to justify. If 
severely punished, highly identified 
children accurately replicate their 
parents’ punishment in doll play, 
they would be showing little displace- 
ment although they have experienced 
severe punishment. 

The final comment on doll play 
analysis of displacement is, to our 
thinking, most serious and applies 
equally to the discussion of
projection below. How are the doll agents
or objects of aggression to be ordered 
for the analysis of the two defense 
mechanisms? Most often, the assump- 
tion is made that the child uses the 
doll most similar to himself as the 
point of origin on the similarity 
continuum. For children who are 
strongly identified with their parents, 
this assumption is suspect. Granted 
this point, however, the additional 
points create greater difficulties. 
Should the grouping be by sex or age? 
Does the dimension for a girl go: girl 
(G), mother (M), boy (B), father 
(F), baby (bb); or G, B, M, F, bb; or, 
perhaps, G, M, F, B, bb; etc.? All of 
these are empirically answerable, 
albeit difficult, questions. One pos- 
sibility is that the dimension is an 
idiosyncratic, response mediated one. 
Another tack may be that the nature 
of the dimension varies depending on 
the behavior being studied; i.e., one 
sequence of dolls for aggression, an- 
other for dependency, etc. As will be 
pointed out in the final section of this 
paper, there is little evidence that can 
be brought to bear on these questions. 
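
To give a sense of how many candidate orderings are at stake, the following sketch enumerates the possible similarity dimensions for a girl when the girl doll is fixed as the point of origin. The doll labels follow the abbreviations used above; fixing the self doll first is the common assumption described in the text, and the function name is hypothetical.

```python
from itertools import permutations

DOLLS = ["G", "M", "B", "F", "bb"]  # girl, mother, boy, father, baby

def candidate_orderings(self_doll, dolls=DOLLS):
    """All orderings of the doll family that begin with the child's own doll."""
    others = [d for d in dolls if d != self_doll]
    return [(self_doll, *rest) for rest in permutations(others)]

orderings = candidate_orderings("G")
print(len(orderings))   # number of candidate dimensions
print(orderings[:3])    # first few, for inspection
```

With five dolls there are 24 such orderings even before the self doll assumption is relaxed, which illustrates why the ordering has to be established empirically rather than by inspection.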

An analog to displacement within 
the doll play session is “projection,” 
which is defined in terms of the doll 
agents of aggression. Presumably, a 
doll most similar to the child carrying 
out hostile acts represents projection 
of the child's hostile impulses to the 
doll. The above comments on dis- 


placement also apply to this mecha- 
nism. 

A number of studies relate the 
agents and objects of aggression to 
demographic characteristics of the 
child, as can be seen in Table 1. However,
a direct test of the displacement
or projection formulations requires 
information about the nature of the 
child's aggression anxiety as well as 
the dolls he chooses to initiate and 
receive hostile acts. Only one study 
yields this information directly (Hol- 
lenberg, 1949). She found that chil- 
dren who were severely punished for 
aggression at home projected aggres- 
sion more in doll play than did less 
severely punished children. Com- 
parable data on displacement are not 
available. 

In summary, the demographic and 
practice correlates of doll play ag- 
gression are clear and substantial. 
However, the problems of greater 
theoretical interest—the child rearing 
correlates of doll play aggression— 
must, because of their conceptual 
unclarities and inconsistent results, 
remain open questions. A thorough 
test of the displacement model would 
require information about the anxi- 
ety attached to the expression of 
aggression at home, the amount of 
such behavior actually exhibited at 
home, the instigation to aggression, 
and the amount of aggression shown 
in doll play. A questionable assumption
is that the instigation to aggression
is more or less the same in doll
play as in the home—that it is a 
characteristic of the person inde- 
pendent of the situation. No single 
study fulfills more than one or two of 
these requisites. 


Stereotypy 


Many doll play studies have cate- 
gorized routine, habitual actions, 
“doll action appropriate to the time, 
place, situation, and characters involved”
(Phillips, 1945). These behaviors
are most often termed “stereotyped,”
although they sometimes
have been labeled “realistic” or
“routine role.” Such doll actions
usually constitute a considerable part 
of the total acts in a session. Krall 
(1953) reports that stereotyped the- 
matic responses constitute 45% of all 
responses made in doll play, and 
Bach (1945) and E. Z. Johnson 
(1951) report that 59% and 66% of 
thematic responses are stereotyped 
actions. 

The most consistent finding with
regard to stereotypy is a sex differ- 
ence: girls show more stereotyped 
behavior than do boys (Bach, 1945, 
1946; Bremer, 1947; Honzik, 1951; 
Pintler, Phillips, & Sears, 1946; 
Yarrow, 1948). This finding might 
be attributable to the greater famil- 
iarity of girls with doll playing, but 
since it occurs from age 3 onward, it 
would seem more likely to be related 
to greater inventiveness of young 
boys. As a case in point, Tuddenham
(1952) reports that first, third, and 
fifth graders recognize that the 
"typical girl" is less daring than the 
"typical boy." 

The amount of stereotypy decreases
from session to session (Bach,
1945; Phillips, 1945; Yarrow, 1948), 
a fact which may be explained in 
several ways. The higher incidence 
of aggression in later doll play ses- 
sions may displace stereotyped re- 
sponses. Also, the relaxation of 
restraints in the second session which 
yields more aggression may also lead 
to more nonstereotyped, nonaggres- 
sive behaviors. It seems natural that 
a child faced with a new situation 
would first represent the most highly 
practiced behaviors—the routine acts 
of the home. 

Attempts have been made to re- 
late stereotypy to adjustment. Bach 
(1945) reported that children whom 
teachers rated as being “well adjusted”
showed a higher rate of decrease
of stereotypy over sessions
than did “poorly adjusted” children. 
Holway's (1949) findings on “realistic”
play, which seems to be closely
related if not identical to stereotyped 
play, show that at the end of therapy, 
children play more realistically using 
less fantasy, aggression, or tangential 
(nondoll) play. Holway's study at- 
tempted to relate doll play to child 
rearing variables. She found that 
realistic play was positively related 
to the amount of early self-regulation 
in feeding and the number of months 
the child was breast fed. 

In Holway’s (1949) sample of 3-5 
year olds, there was no correlation 
between realistic play and either CA 
or IQ. However, Graham (1952), 
comparing seven "bright" primary 
school children with seven "dull"
ones, found that the brighter children 
used more stereotyped responses. 

Aside from the sex differences, 
session-to-session changes, and pos- 
sible IQ differences, there have been 
no other substantial findings with 
regard to stereotyped play. In stud- 
ies of delinquents (Bach & Bremer, 
1947), accident repeaters (Krall, 
1953), and various methodological 
explorations reported above (Phillips, 
1945; Pintler, 1945; Robinson, 1946), 
no significant differences in the 
amount of stereotyped behavior were 
found between experimental and con- 
trol groups. In the area of parent 
separation, the results are not con- 
sistent—Bach (1946) found that 
father-separated children showed 
more stereotyped fantasies about 
home life, whereas Scott (1954) re- 
ported that institutionalized children 
indulged in less stereotyped play 
than did children living with their 
parents. 

The stereotype category is usually 
regarded as a residual category rather 
than as a major interest. A recent 
study indicates that it may have 
some predictive value if further analyzed.
Melville (1959) found that
children who spend a large proportion
of their school time working industriously
use the “work routine” category
(that portion of stereotyped behavior 
which is work oriented) in fantasy 
more than do less industrious chil- 
dren. Note that this is a more or less 
direct replication in doll play of ob- 
served real life behavior. Melville's 
study suggests that a finer break- 
down of the stereotypy category 
might be profitable. 


Doll Preference 


Although many doll play studies 
record which dolls were used as 
agents and objects of fantasy acts, 
few of them report analysis of doll 
usage in any detail. The greatest 
interest in this variable has been evi- 
denced by researchers in the areas of 
aggression and the effects of separa- 
tion from parents, and the results are 
presented in the appropriate parts of 
this paper. 

Probably the best substantiated
generalization to be made about this
topic is that subjects tend to prefer 
the same sex parent doll to the parent 
doll of the opposite sex. This tend- 
ency shows some increase with age. 
The finding has not appeared in every 
study—e.g., Graham's (1952) sub- 
jects, regardless of sex, tended to use 
the mother doll more than the father 
doll—but significantly greater use of 
the opposite sex parent has not been 
reported. E. Z. Johnson (1951) 
found that while in portrayal of 
routine (stereotyped) behavior all 
subjects used the mother more often 
than the father doll, the greatest 
session-to-session increase in the use 
of the father occurred among older 
boys. In a nursery school sample 
(Sears et al., 1953), the girls used the 
mother doll more than the father, 
while the boys used the two dolls 
equally, thereby employing the father 
doll more than the girls did. 


Five studies which have used rela- 
tively structured situations to lead 
the child to make a direct choice also 
report same sex preference. Ammons 
and Ammons (1949) found a father 
preference among 3- and 4-year-old 
boys, and a mother preference among 
4- and 5-year-old girls, and R. Lynn's 
(1955) 6-year-old subjects showed a 
greater preference for the same sex 
parent doll than did her 4-year-old 
subjects. Emmerich (1959) had his 
subjects complete stories using first 
the adult and then the child dolls. 
Correspondence between the two sets 
of behaviors was taken to indicate 
high identification. He found that 
preschool children—especially boys 
—tended to identify more with par- 
ents of the same than with parents of 
the opposite sex. Similarly, highly 
sex-typed boys depict more nurtur- 
ance, punishment, and power via the 
father than via the mother doll 
(Mussen & Distler, 1959). To get at 
sex role identification, Rabban (1950) 
asked children aged 3-9 to select the 
doll that “looks most like you.” 
Starting at the age of 4, the choices 
were correct as to sex. 

Preschool children who have been 
reared permissively emphasize the 
adult dolls in their fantasy produc- 
tions (Levin, 1958). This finding may 
be interpreted in several ways: per- 
missive parents interact more with 
their children and thereby provide a 
more frequent adult model, parents 
who rear their children permissively 
permit them to explore and practice 
adult-like behaviors more than do 
nonpermissive parents, and permis- 
siveness is one of the antecedents of 
identification with parents which is 
reflected in the child’s preoccupation 
with adult actions in doll play. 


Effects of Separation from Parents 


Interest in this area grew out of 
the problems of wartime father sepa- 
ration, and the majority of studies 
have been concerned with the absence
of the father rather than the
mother from the home. The studies 
of father absence can conveniently be 
divided into two groups: those con- 
cerned with children currently sepa- 
rated from their fathers, and those of 
children whose fathers had been 
absent during the first year or two of 
the child's life but were living with 
the family at the time of the study. 
Bach (1946) studied children aged 
6-10 whose fathers were in the service 
abroad and had been away for 1-3 
years. He found that father-separated 
children, compared to a control 
group whose fathers were at home, 
produced fewer doll actions that in- 
volved the father doll; enacted a 
more stereotyped view of family life; 
and made the father doll more ag- 
gressive, less authoritarian, and more 
affectionate, than did the control 
group. Using a smaller group of sub- 
jects, he found that where the mother 
described the absent father to the 
children in deprecatory terms, the 
children portrayed the father as being 
more aggressive to his doll children, 
but as receiving more affection from 
them; i.e., unfavorable typing of the 
absent father seemed to produce 
ambivalent feelings in the children. 
Another study (Sears et al., 1946;
Sears, 1951) found that nursery
school children whose fathers were
absent from the home did not show 
the session-to-session increase in ag- 
gression that is usually found. In 
addition, boys (but not girls) whose 
fathers were absent were less aggressive
in their fantasies (Sears et al.,
1946). The father-present control 
group of boys showed most aggression 
in doll play toward the father doll 
and the boy doll (sex category), 
while the boys without fathers showed 
most aggression toward the father 
and mother dolls (age category) 
(Sears, 1951). 

Lynn and Sawrey (1959), using the
Structured Doll Play Test, have investigated
absence of fathers in children
of Norwegian sailor families.
They found that girls (but not boys) 
whose fathers were gone were more 
dependent than the control children. 
However, on a measure of “maturity”
(choice of sleeping in a crib or bed), 
boys without fathers were less ma- 
ture than boys whose fathers were at 
home. In contrast to other studies of 
father absence, this one also investi- 
gated the child's relationships with 
mother, and concluded by doll play 
and other techniques that the mothers 
of father-absent children were more 
overprotective than were control 
group mothers. 

Studies of homes where the father 
is currently absent do, then, find 
substantial results. Positive results 
have not been so easy to find in 
studies in the second group—those in 
which a previously absent father is 
present in the home at the time of the 
investigation. Halnan (1950), L. C. 
Johnson (1952), and Ryder (1954) 
performed doll play studies as part of 
the Stanford University research on 
father relations of war-born children. 
Only one difference was found be- 
tween responses of control groups and 
those of children aged 4-7 who had 
been separated from their father dur- 
ing the first 2 years of life. In Ryder's 
study the doll play of the previously 
father-separated children was rated 
as revealing more aggressive feeling. 
Since this was an inferred measure 
rated by the experimenter and an 
observer, and since measures of overt 
aggression in doll play did not show 
any significant differences in this or 
either of the other two studies, it 
must be concluded that there is little 
evidence of marked effects on the 
doll play of children temporarily 
separated from their fathers in early 
life. 

In view of the recent great interest 
in the effects on the child of separation
from his mother, it is surprising
that doll play has not been used to 
investigate this area. So far as is 
known, there has been no study using 
this technique with children living in 
households where the mother is ab- 
sent. However, there are two investi- 
gations of children separated from 
both parents. Heinicke (1956) stud- 
ied 2-year-olds living in residential 
nurseries because their parents were 
on vacation, sick, or having another 
child. He found results which agreed 
with observations of the subjects in 
their nursery life—e.g., they sought 
the affection of adults by crying— 
but, as has been observed before, 
most of his results were not specifi- 
cally concerned with doll play re- 
sponses. Scott (1954) studied children 
separated from their parents because 
they had been institutionalized be- 
cause of neglect, mental illness in the 
family, etc. He found that the sub- 
jects showed a much greater than 
average tendency toward “metamorphosis,”
i.e., the subject himself acted
as an authority figure and treated all 
the dolls as children. It is doubtful 
that this result should be attributed 
to parent separation as such; it seems 
just as reasonable to relate it to the 
effects of institutionalization. 


Reactions to Racial and Religious 
Differences 


Of the studies of children's reac- 
tions to Negro-white differences, 
several (Goodman, 1952; Graham, 
1955; Radke & Trager, 1950; Steven- 
son & Stewart, 1958) have used both 
Negro and white subjects, while one 
used only white subjects (Ammons, 
1950; Ammons & Ammons, 1953), 
and another (Clark & Clark, 1947) 
used Negro subjects exclusively. In 
this area unstructured doll play, 
compared to structured, has not 
produced meaningful results. Graham
(1955) recorded the free play of
Negro and white subjects with both
Negro and white dolls, but made only 
intraracial analyses of his data. There 
were no outstanding differences be- 
tween the two groups. Goodman 
(1946) used only 24 subjects in the 
part of her study involving free play, 
and found no statistically significant 
differences between Negroes and 
whites. However, she did uncover 
several trends which seem worthy of 
follow-up with a larger group—e.g., 
Negro subjects tended to assign main 
roles to white dolls, and seldom re- 
vealed positive evaluations of Negro 
dolls. In her later studies (Goodman, 
1952), where no statistical evalua- 
tions were made, she reported that 
doll play was a successful technique, 
but it is difficult to tell how much 
success is attributable to the free 
play method itself since it was used 
mainly as the introduction to a doll 
play interview. 

It has been much more common, 
and apparently more profitable, to 
use controlled methods of exploration 
like direct questioning about prefer- 
ence for dolls of different color (Clark 
& Clark, 1947; Goodman, 1952; 
Radke & Trager, 1950; Stevenson & 
Stewart, 1958), identification of race 
(Ammons, 1950; Ammons & Ammons,
1953; Clark & Clark, 1947), requiring
the child to pair dolls which “go together”
(Goodman, 1952), pairing
dolls with middle class or slum 
houses, and with dress-up or work 
clothes (Radke & Trager, 1950), and 
various incompleted stories which 
offer an opportunity for a doll of one 
color to “win” over a doll of another 
color (Ammons, 1950; Ammons & 
Ammons, 1953; Goodman, 1952). 

Results obtained from these tech- 
niques, sometimes used in connection 
with more extensive interviewing 
(Ammons, 1950), are in fair agree- 
ment with one another. Negro and 
white nursery school children appear 
to be well aware of racial physical 
differences (Ammons, 1950; Clark &
Clark, 1947; Goodman, 1952). Both 
racial groups are likely to identify 
with the white doll when asked 
“Which looks most like you?” (Clark 
& Clark, 1947; Goodman, 1946), 
although with increasing age there is 
more correct identification until at 
age 7 a slight majority of Negroes 
identify with the Negro doll (Steven- 
son & Stewart, 1958). In addition, 
some of the Negroes show either 
confusion or wish fulfillment by in- 
sisting that although they are now 
dark skinned, they had white skins as 
babies (Goodman, 1946). There have 
been consistent reports that white 
dolls are preferred esthetically by 
white children, while Negro children 
do not show a clearcut preference for 
Negro dolls, but instead may either 
choose the white doll (Clark & Clark, 
1947; Goodman, 1946), show only 
a slight preference for the Negro doll 
(Radke & Trager, 1950), or show 
reluctance to make any choice (Good- 
man, 1952). Interpretation of this 
result is not unequivocal, since it may 
reflect past experience with dolls and 
story book characters who are more 
often white than colored. More im- 
portant would seem to be Radke and 
Trager’s (1950) finding that, even 
when subjects are equated for social 
class, 5-7 year olds of both races
accept the idea that Negroes belong 
in poorer housing. 

While it seems to have been dem- 
onstrated clearly by the method of 
doll play that children are capable of 
making discriminations on the basis 
of color, it has not been shown that 
these discriminations are reflected in 
fantasy behavior in any consistent 
way. Ammons (1950) reported that 
white boys showed a tendency with 
increasing age to use Negro dolls as 
scapegoats. On the other hand,
analysis of the same data did not re- 
veal any differences in the success of 
Negro vs. white dolls in conflict—
whichever doll the subject was using
at the moment tended to be successful 
in aggression (Ammons & Ammons, 
1953). Stevenson and Stewart's 
(1958) Southern Negro subjects chose 
the white doll as the one with whom 
they would like to play, except at the 
oldest age level—7 years—where a 
small majority chose the Negro doll. 
Goodman (1946) found no social 
acceptability differences among sub- 
jects who had chosen the white doll 
esthetically. The white doll might be 
"prettier," but the Negro doll was 
just as acceptable as a birthday party 
guest. In further studies, Goodman's 
(1952) subjects mixed the races in- 
discriminately in free doll play. 

This lack of consistent discrimina- 
tory behavior in doll play is paralleled 
by a similar unconcern in observed 
behavior. Goodman (1946) found no 
consistent prejudice in nursery school 
behavior of her mixed racial group. 
The results of doll play are congruent 
with those of other methods in finding 
a poor correspondence between beliefs 
and the development of interracial 
behavior. The evidence on this point
has been reviewed by Harding,
Kutner, Proshansky, and Chein
(1954).


Hartley and Schwartz (1951) de- 
scribed materials and procedures for 
studying attitudes toward religious 
groups. Subjects were given three 
doll families, each of which stands in 
front of a montage background of 
photographs, one suggesting a Jewish 
religious context, one Catholic, and 
the other a middle class home without 
any religious symbols. The investi- 
gator notices what spontaneous 
identification the subject makes of 
the backgrounds, and uses these as a 
lead-in to a doll play interview with 
the child. The only data available 
from the use of this technique are 
some protocols, but it appears to be 
easily adaptable to the analysis of 
group differences. 


Experimental Manipulations


It is obvious, at this point, that 
the bulk of the studies has employed 
doll play to measure naturally exist- 
ing characteristics of the subjects, 
with no attempt to influence these 
characteristics. By contrast, projec- 
tive studies of adults have recently 
used experimental variations both to 
test specific hypotheses for which 
manipulation is relevant as well as to 
ascertain the validity of the measure- 
ment (e.g., McClelland, Atkinson, 
Clark, & Lowell, 1953). 

The four experimental studies of 
doll play divide into two groups: 
either some experiences of the child 
prior to doll play or experiences dur- 
ing the course of the procedure are 
varied. In the first category, Bach 
(1945), testing a frustration-aggres- 
sion hypothesis, subjected some of his 
preschool subjects to a longer rest 
period than others just before a doll 
play session. Since a long rest was 
presumably frustrating, these chil- 
dren elaborated the rest theme in 
their fantasy output more often and 
more aggressively than did the short 
rest group. Yarrow's (1948) results 
are less clear. He had one group of 
subjects play with a difficult tinker 
toy before doll play and compared 
them to a group who were given an 
easy task. The frustrated subjects
tended to show increased aggression,
more tangential play, and more distorted
thematic play than the other subjects,
but the results were not statistically
significant. When the children experienced
antecedent satiation—putting
pegs into boards until they refused
to continue—they gave more
inappropriate thematic units—e.g.,
sleeping in the kitchen.

To test the effects of aggression 
anxiety on fantasy aggression, Sperry 
(1949) compared three groups of 
children, each of whom participated 
in four sessions. For one group the 
experimenter disapproved of the subject's
aggression in the second session.
The experimenter disapproved of the
subject's aggressive acts in the second 
and third sessions for another group 
and expressed no disapproval to the 
control group. Only the group pun- 
ished in the second session decreased 
their disapproved acts in the third 
period (Hollenberg & Sperry, 1951). 

Working also with a model of ag- 
gression inhibition, Levin and 
Turgeon (1957) compared two groups 
of subjects. The first group’s second 
doll play session was observed by 
their mothers; in the other group a 
strange adult female was present. 
Mothers facilitated the children’s 
aggression whereas the stranger in- 
hibited socially disapproved acts. 

In general, doll play has suffered 
from a dearth of experimental treat- 
ment. Some experimental operations 
relevant to the variables being meas- 
ured would add to the validity of the 
method and, to judge from other 
projective techniques, would provide 
more discriminating measures of in- 
dividual differences. 


Discussion 


What can we say now about the 
doll play technique, which two dec- 
ades ago appeared so promising? 
Certainly an overall body of sensible, 
interrelated findings is not apparent. 
Where doll play was used in a con- 
nected group of studies from one 
laboratory, coherent results do ap- 
pear. Otherwise, single investigators 
performing one or two studies using 
the method occasionally report inter- 
esting results but there are almost as 
many islands of findings as there are 
investigators. One might hope that 
the common method would provide 
the links between studies, but the 
flexibility of doll play, both in proce- 
dure and scoring of variables, makes 
the connections among findings ten- 
uous. 

In the area of aggression there are 
results that have been replicated. 
However, their very redundancy 
makes them appear trivial in comparison
to what might have been discovered
in the years of effort. We
may take as fact that young boys are 
more aggressive than young girls and 
that children are more aggressive in 
the second than in the first doll play 
session. Most other doll play findings 
have to be hedged with boundary 
conditions, and restrictions must be 
put on general statements. 

To understand this state of affairs 
it may be useful to review the virtues 
and shortcomings of doll play. We 
believe that the meager payoff comes 
not from the technique itself, but 
from the assumptions which underlie 
the method. First, what should any 
method of assessing personality pro- 
vide? Objectivity has not often been 
a problem in doll play so long as the 
variables are carefully defined and 
the scorers are well trained. Reliability
has been looked at both in
terms of the consistency of behavior 
and in terms of categorizing agree- 
ment among scorers. Besides, the 
method is not heavily dependent on 
verbalization, which recommends it 
for use with young children, and it is 
interesting to them. The major diffi- 
culties appear in understanding what 
the method is measuring. 


Replication and Wish Fulfillment 


The basic question that has influenced
the understanding of doll play
is whether the child is telling about 
events and hopes and plans which are 
available to him in his day-to-day 
world or whether his acts in this set- 
ting are otherwise unavailable. The 
criterion for identifying wish fulfilling 
fantasies is that nonfantasy expres- 
sion of the behaviors is prohibited 
and they are then expressed in fan- 
tasy. The prohibitions may be actu- 
ally imposed on the child or may 
result from natural conditions: e.g., 
his color or sex or size. Therefore, the 
specifications for wish fulfilling fantasies
are four: evidence that there
are in “real life” some restraints
against the expression of the behavior 
in question, a desire for such expres- 
sion, little overt manifestation of the 
behavior, and the appearance of the 
behavior in fantasy. 

Few research studies include all of 
the requirements of the wish fulfill- 
ment model. The studies of parental 
punishment and fantasy aggression 
make certain assumptions about the 
model, but both the results and the 
assumptions must be questioned since 
no study clearly replicates another. 
For example, it is assumed that 
severe parental discipline inhibits 
overt expression, yet there is some 
evidence that punishment and overt 
aggression are positively correlated 
(Sears, Maccoby, & Levin, 1957). 

In the studies of racial identification
and prejudice, the assumptions,
although not usually specified, are 
often reasonable. For example, a 
substantial number of Negro children 
indicate in doll play that they want 
to play with white children (e.g., 
Clark & Clark, 1947). To take this as 
wish fulfilling behavior we need to 
know that such interracial play is not 
possible and actually does not occur. 
These inferences may be based on
sociological characteristics of the 
child's neighborhood, although it is 
preferable to test these assumptions 
directly. 

The inclusion of "wishes" under 
the replication rubric requires some 
explanation. If the child's wishes are 
not denied real expression, this cate- 
gory of behavior does not fit our wish 
fulfillment model. One way of think- 
ing about doll play behavior is that it 
gives the child an opportunity to 
express his current experiences and 
preoccupations. The correspondence 
between real life and fantasy need not 
be uninteresting for research purposes.
In this type of fantasy the child may 
give the researcher a picture of his 
thoughts and actions which would be 
much more difficult to elicit in an interview.
Also, so far as the child's
functioning is concerned, replicative 
fantasy may well provide him an 
opportunity to practice and develop 
skills which are transferable to his 
nonfantasy life. 

To take advantage of the wish 
fulfillment-replication distinction in 
research, it would be most helpful if a 
child consistently acted either one or 
the other type of fantasy. Unfortu- 
nately, such is probably not the case. 
A child may change his emphasis 
from session to session or may vary 
the proportions of fantasy within a 
session. The ideal condition would 
allow the researcher to categorize a 
sequence of doll play as wish fulfill- 
ment or replicative. Our current 
knowledge about children's fantasies 
precludes any such simple procedure
although, as we suggest below, there
may be some guides in making this
decision within doll play itself.

Some researchable problems 
which would aid in distinguishing 
and making use of the differences 
between replication and wish fulfill- 
ment are suggested below: 

1. Without exception in the doll 
play studies reviewed the fantasies 
have been categorized in terms of 
simple counts of units. Molar se- 
quences of behavior and units of 
interaction which are now common in 
observations of adult interaction have 
not been applied to children's fan- 
tasies. For example, if the sequence 
is "the father spanks the boy and 
then the boy hits the father" we 
might be more justified in tolerating 
the notion for future tests that this is 
a wish fulfilling episode compared to 
the "father spanks the boy and the 
boy cries." 

Likewise, doll play actions that 
indicate that inhibitions are being 
overcome may be discernible. The 
two indices that have been used are 
latency of the first aggressive act and 
the occurrence of tangential beha- 
vior. The latter may be promising if 
analyzed in a sophisticated fashion. 
Tangential actions such as looking 
out the window or engaging the ex- 
perimenter in conversation which 
appear irrelevant to doll play may 
indicate a variety of states. The 
child may be bored, or unable to 
think of more actions to portray, 
or he may indeed be experiencing 
anxiety over some impulse which is at 
the threshold of experience. These 
possibilities could be studied within 
the doll play protocol. It would be 
interesting to see if the precursors to 
boredom are a sequence of redundant 
acts by the subjects. On the other 
hand, signs of disinhibition may be 
succeeded by behaviors we assume to 
be generally prohibited or have been 
specifically prohibited for the subject. 

In summary, we are saying that 
there exist in the usual doll play data, 
possibilities for more elaborate and 
potentially more profitable analyses 
than have so far been made. 

2. The above approach to the wish 
fulfillment and replication problem 
focuses on response measures. It is 
our belief that the study or manipula- 
tion of antecedent conditions also 
may be a fruitful tack. 

Our first suggestion is to make use 
of detailed naturalistic information. 
A log of the child’s experience for a 
day or two prior to doll play might be 
kept and the fantasy protocol com- 
pared with what we know occurred in 
the child’s life. A very detailed log is 
represented by One Boy's Day (Barker 
& Wright, 1951). Such an approach 
is clearly inductive and simply pro- 
vides a mass of data which may be 
scrutinized for simple correspond- 
ences or for more complex transforma- 
tions between real life and doll play 
fantasy. For example, one could look 
at the ways in which an objectively 
described situation is filtered through 
the child’s perceptions, and the results 
might provide clues to types of ex- 
periences which form the raw ma- 
terials for wish fulfillment compared 
to those types of experiences which 
are replicated with a high degree of 
fidelity. 

The above naturalistic approach 
may point up variables, which, 
through experimental variation, will 
provide more substantial causal rela- 
tionships between experiences and 
fantasy. For example, will a series of 
successes followed by a failure be 
fantasied as a success or failure? Does 
strongly goal oriented action that is 
not permitted consummation appear 
in doll play as goal achieved? Can a 
child be given a set to portray either 
wish fulfilling or replicative events? 

In essence, we are suggesting that 
an experimental approach to the 
antecedents of children’s fantasies 
has been tried very little and may 
provide substantial payoff. If signifi- 
cant antecedent manipulations are 
found, and their effects are potent 
and consistent across subjects, a 
more convenient response index to 
the two types of fantasy may appear. 
A case in point is the empirically 
derived scoring scheme for n Ach, 
which includes those categories of 
fantasy that respond consistently to 
experimental manipulation of arousal 
compared to neutral instructions. 
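
The contrast drawn under point 1 above, between simple counts of units and molar sequences of interaction, can be illustrated with a minimal sketch. Everything in it is hypothetical: the coded protocol, the (actor, act, target) coding format, the set of acts treated as aggressive, and the function names are assumptions made for illustration and are not taken from any of the studies reviewed.

```python
from collections import Counter

# Hypothetical coded doll play protocol: (actor, act, target) units in order.
protocol = [
    ("father", "spanks", "boy"),
    ("boy", "hits", "father"),
    ("mother", "feeds", "baby"),
    ("father", "spanks", "boy"),
    ("boy", "cries", None),
]


def unit_counts(units):
    """Simple count of each act, ignoring the order in which acts occur."""
    return Counter(act for _, act, _ in units)


def retaliation_sequences(units, aggressive_acts=("spanks", "hits")):
    """Count adjacent pairs in which the target of an aggressive act
    immediately directs an aggressive act back at the original actor."""
    count = 0
    for (actor1, act1, target1), (actor2, act2, target2) in zip(units, units[1:]):
        if (act1 in aggressive_acts and act2 in aggressive_acts
                and actor2 == target1 and target2 == actor1):
            count += 1
    return count


print(unit_counts(protocol))
print(retaliation_sequences(protocol))
```

In this made-up protocol the unit count records two spankings without distinguishing them, whereas the sequence measure separates the spanking that is answered by retaliation from the one that is followed by crying.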


Nature of Instigation in Doll Play 


One of the presumed virtues of doll 
play is that the amorphousness of the 
stimulus situation would permit wide 
expression of "person" variables. 
Consequently, the preoccupations of 
the subject would be the major de- 
terminants of his fantasy responses. 
Recently, the contribution of the 
instigating stimulus itself has re- 
ceived serious attention in projection 
theory. For example, in the TAT 
measurement of need achievement 
the pictures were found to vary in the 
degree to which they elicited achieve- 
ment imagery. 

In doll play, we get the impression 
that the situation may be too broadly 
instigating for the purposes for which 
the technique is often used. Since 
this projective method is used to 
measure a wide variety of child be- 
haviors, it is questionable if it is an 
equally appropriate measuring device 
for all of the variables. The data 
imply quite certainly that doll play is 
a useful device for measuring fantasy 
aggression. Beyond that, the inci- 
dence of other actions which may be 
coordinated to such motivational 
systems as dependency and achieve- 
ment appears to be meagre. In other 
words, the home as the miniature 
situation is associated with so many 
kinds of behavior that the researcher 
cannot be sure that the actions in 
which he is interested will appear with 
sufficient frequency to be useful. 

We can suggest two devices to 
narrow the spectrum of instigation. 
The first is to arrange a doll play set- 
ting which calls forth the specific 
behaviors upon which the study 
focuses. For example, several studies 
have used a school room rather than a 
house when the researcher was inter- 
ested in school related behavior 
(Bach, 1945; Melville, 1959) and one 
study used a play yard setting 
(Bremer, 1947) to study play related 
behavior. 

The second procedure focuses doll 
play even more narrowly, and may be 
thought of as analogous to the story 
completion technique. The experi- 
menter presents the child with a 
problem and then permits the child to 
complete the action when the dolls 
are given to him. Lynn's recent 
structured doll play test follows this 
procedure; and the studies of preju- 
dice in which the child is asked to 
make a choice of a doll are a second 
example of the focused method. 


SUMMARY 


This paper surveyed the develop- 
ment and uses of doll play as a re- 
search tool. Besides methodological 
studies the findings in five areas of 
investigation which have used doll 
play were summarized: aggression, 
stereotypy, doll preference, effect of 
separation from parents, and preju- 
dice. Although certain groups of 
studies yield interrelated results, the 
use of this research tool has been so 
varied that the overall impression is 
of many disparate findings, in spite 
of the basic similarity in method. It 
is suggested that a conceptual diffi- 
culty underlying the studies has been 
the lack of distinction between wish 
fulfilling and replicative fantasies in 
children. 

There is a growing number of social 
and biological scientists who feel the 
need for a comprehensive theory of 
behavior—a theory of which schizo- 
phrenia in particular, or psycho- 
pathology in general, is only one 
facet. The theory should be broad 
enough to encompass data from such 
apparently diverse fields as anthro- 
pology, phylogenesis, human develop- 
ment, and states of lowered conscious- 
ness. Data from all of these areas 
contribute to our understanding of 
human behavior, and it would seem 
that the law of parsimony would be 
better served if these data could be 
subsumed under the same concepts 
and interpreted in terms of a com- 
mon set of principles. 

This paper attempts to outline a 
comparative-developmental approach 
to schizophrenia. It is comparative 
in that it relates data from the study 
of schizophrenia to many different 
fields of inquiry. It is developmental 
insofar as it is suggested by, and 
draws its basic facts from develop- 
mental studies—the development 
from conception to birth, the develop- 
ment from childhood to adulthood, 
the development from the single- 
celled organisms to man, and from 
developmental studies of human cul- 
tures. 

For the particular organization of 
the approach to schizophrenia pre- 
sented here, the author accepts re- 
sponsibility; the original formulation 
of the comprehensive comparative- 
developmental theory is that by 
Heinz Werner (1940) and his co- 
workers at Clark University. 

Presently with National Analysts, In- 

Werner's comparative-develop- 
mental approach aims at viewing the 
total behavior of all organisms in 
terms of a common set of develop- 
mental principles. It is his belief that 
such an approach is fruitful in co- 
ordinating, within a single descriptive 
framework, psychological phenomena 
observed in phylogenesis, ontogenesis, 
ethnopsychology, and psychopathol- 
ogy. This paper confines itself to 
what this theoretical position has had 
to contribute to an understanding of 
schizophrenia. It attempts to indi- 
cate the comprehensiveness and heu- 
ristic value of the approach without, 
however, attempting to present an 
exhaustive review of the large body of 
relevant research. 

Behavior proceeds through given 
stages in its development. A formal 
similarity obtains between the organ- 
ization and structure of processes in 
young children, in organisms low on 
the phylogenetic scale, in human 
adults of technologically backward 
societies, and in certain states of 
lowered consciousness in educated 
normal adults of technologically ad- 
vanced societies. In order for develop- 
mental theory to encompass schizo- 
phrenic processes it requires the 
introduction of constructs which 
suggest a parallelism of various as- 
pects of schizophrenia with develop- 
mental patterns in all of these spheres 
of inquiry, but especially with de- 
velopment in childhood. To this end 
developmental theorists have intro- 
duced the concept of "regression." 
The progression seen in the normal 
course of development is reversed in 
pathology; thus, in schizophrenia we 
may expect to find a regression in the 
direction of greater primitivization of 
process. 

A frequently raised objection to 
developmental theory is that it seeks 
only generic similarities between 
various groups and tends to ignore 
their differences. 

Exploration of developmental the- 
ory does require seeking for system- 
atic patterns of generic similarities 
in cognitive performance among cer- 
tain groups. Thus focused on simi- 
larities, developmental theorists have 
not always taken explicit account of 
specific differences that have ap- 
peared between groups. 

The heuristic value of such an 
approach has already been demon- 
strated by the considerable number 
of investigations that have been pro- 
voked by or conducted under the 
purview of development theory. Its 
clinical value is suggested by its 
contributions to psychodiagnostic 
testing, in particular to the scoring 
and interpretation of the Rorschach 
technique. Genetic theory does not 
question that differences exist be- 
tween the child and adult schizo- 
phrenic. It does hold that similarities 
in cognitive structure exist between 
young children and adult schizo- 
phrenics, both of which are exemplifi- 
cations of an ideal construct, namely, 
developmental primitivity. 

A word now about the use of the 
term "primitive" (Werner & Kaplan, 
1956). Much of the criticism leveled 
at the use of this term is based on the 
assertion that it is moralistic in char- 
acter and thus has little place in 
scientific endeavor. No such evalua- 
tive connotation is intended. While 
"primitivity" is not evaluative in this 
moralistic sense, it is evaluative in 
that it may either impede or facilitate 
attainment of certain goals or states. 
Primitivity pertains to the psycho- 
logically prior stages of development. 
In essence the concept of primitivity 
is a theoretical construct referring to 
a kind of cognition characterized 
by developmentally early processes. 
Processes that appear early in the 
development sequence—that is, early 
in childhood, or early in the temporal 
development of an idea—are more 
primitive than those which appear 
later in the sequence. 

The term "regression" as used by 
Werner (1940) refers to the structural 
re-emergence of developmentally 
lower levels of functioning as the 
more advanced and more recently 
developed levels are disorganized. 
Regression in this sense differs in 
emphasis from the meaning given 
this term by psychoanalytic ortho- 
doxy,² which focuses on impulses and 
the methods by which these are 
gratified and controlled. While psy- 
choanalysis has emphasized the func- 
tion and content of psychopathology, 
the developmental approach con- 
siders only the formal structure of psy- 
chopathological processes. 

By similarity in process between 
childhood and pathological primitivi- 
zation reference is made to structural 
similarity, not to similarity in con- 
tent. The regressed adult is, of course, 
not a child; rather, similar organiza- 
tions or forms of process are identifi- 
able in both. Our interest here is not 
primarily in what children or schizo- 
phrenics think or perceive, but rather, 
how they think or perceive. Schizo- 
phrenia thus is seen as a regression in 
cognitive processes; that is, it is 
conceived as a reversal of those pat- 
terns of thinking, perceiving, and so 
on, which are encountered in the nor- 
mal course of development. Further, 
developmental theorists are not con- 
cerned with the nature of the condi- 
tions that have caused the regressed 
behavior or the historical antecedents 
of such conditions. Rather, they 
focus on the structural or formal 
consequences of these predisposing 
experiences. 

2 Although Freud considered ego regression 
as well as impulse regression, many psycho- 
analytic practitioners are inclined to over- 
emphasize the latter at the expense of the 
former. 

It should be made clear that the 
psychoanalytic and the comparative- 
developmental approaches are not 
mutually exclusive; rather, they focus 
on different aspects of schizophrenia 
(Arieti, 1955). Each may be clinically 
useful and theoretically productive. 
Devoting attention in this paper to 
the structural point of view does not 
attribute less value or validity to the 
psychodynamic viewpoint. Where 
the psychodynamic approach is par- 
ticularly helpful in therapy, the 
structural approach is useful in de- 
veloping hypotheses, describing de- 
velopmental phenomena within a 
consistent framework, and—most im- 
portant to the clinician—it provides 
a gauge by which psychopathological 
states and modifications in those 
states may be assessed and under- 
stood in terms of developmental 
criteria (Siegel, 1953). The concept 
of schizophrenia which is proposed 
here proceeds from a basic develop- 
mental principle: wherever develop- 
ment takes place it begins in a 
globality or lack of differentiation 
and becomes increasingly more differ- 
entiated, terminating in a state of 
integration. The development of 
motor coordination may serve to 
illustrate this developmental prin- 
ciple. 

When stimulated, the newborn 
typically reacts with mass nondi- 
rected motor activity. In the normal 
course of maturation, this mass action 
becomes more focalized and better 
directed with respect to the stimulat- 
ing agent. That is, from the total 
involvement of the whole body 
emerges a differentiated activity of 
certain parts of the body: arms, 
legs, head, etc. These now differenti- 
ated movements become integrated 
into a single smooth-flowing response 
in which all parts of the body may 
participate appropriately in achiev- 
ing a goal or solving a task. 

Now let us turn to the separate 
functions that this approach encom- 
passes. In each case the comparison 
will be made between human onto- 
genesis and schizophrenia. 


EMOTIONAL BEHAVIOR 


Ontogenetic changes in emotional 
behavior proceed along at least 
three continua: (a) From overt 
motor expression of emotion to in- 
creasingly more internalized experi- 
ence of emotion. Crying (Bayley, 
1932) and other motor activity de- 
crease with age. (b) From globality 
of emotional experience to greater 
differentiation (Bridges, 1932). At 
first there are only undifferentiated 
affective states of relative excitement 
or quiescence. With development 
there is greater specificity of emotion. 
For example, global negative affect 
becomes more differentiated into in- 
creasingly more subtle nuances, such 
as hate, despise, contempt, dislike, 
etc. (c) From lability of emotional 
experience to increased stability. In 
the young child there is character- 
istically momentary change in the 
nature of his emotional experience 
and its expression (Jersild, 1939). 
What starts out as a laugh may end 
up in bitter tears or vice versa. Cry- 
ing can be quickly changed to gig- 
gling by a well intentioned and well 
placed tickle. 

In accordance with the regression 
hypothesis, in schizophrenia there is 
the expectation of a reversal in each 
of these three progressions: 

1. In the acute stage of the illness, 
before chronicity becomes manifest 
in affective blunting, emotion is un- 
controlled; impulse is expressed 
overtly without adequate intellec- 
tual intervention. Not only is the 
expression of affect likely to be more 
public, but there is an increase in the 
degree of motor involvement. Thus, 
the motoric hyperactivity of the ex- 
cited schizophrenic and the motoric 
hypoactivity of the chronic "burnt- 
out" schizophrenic both exhibit the 
degree to which the emotional state is 
syncretically (Werner, 1940) fused in 
its expression with the motoric sys- 
tem. Although the affective and 
motoric are never wholly independent 
(Wolff, 1943) of each other, the 
immediacy, directness, and overtness 
of this relationship tend to increase 
in schizophrenia. 

3 A comprehensive survey of develop- 
mentally oriented research in childhood may 
be found in Werner (1946). 

2. The increasing differentiation 
and subtlety of feelings seen in onto- 
genesis is reversed in schizophrenia. 
Clinical practice, in particular ex- 
perience with the projective tech- 
niques, reflects the dedifferentiation 
of feelings. Aggressive and sexual 
components are not infrequently 
fused into an indistinguishable whole. 
Even more striking is the blatant 
admixture of positive and negative 
impulses. 

3. Though perhaps not to the 
same degree, the emotional experi- 
ence of the acute schizophrenic is 
similar to that of the young child in 
that it, too, is highly labile and un- 
predictable. 


PERCEPTION 


The progression from globality to 
differentiation to integration is per- 
haps best seen in perception. For the 
neonate and very young child the vis- 
ual field is not well organized or struc- 
tured. Figure and ground, contours, 
patterns of light and shadow, move- 
ment, all merge into an undifferenti- 
ated perceptual mass, or in William 
James’ classic terminology, “a bloom- 
ing, buzzing confusion.” From this 
globality emerge stages of increas- 
ingly differentiated perception. Here 
visual patterns acquire object-prop- 
erties, with definitive contours, and 
are localized in three-dimensional space. 
This development then terminates in 
a stage in which these differentiated 
aspects of the perceptual field are in- 
tegrated, or synthesized, into a single 
meaningful percept (Werner, 1940). 

This developmental sequence has 
been corroborated by a number of 
experiments, the most convincing of 
which have used the Rorschach 
blots as stimulus material (Hem- 
mendinger, 1953). Use of this tech- 
nique reveals the following changes 
to take place with increasing age. 

Three-year-olds are whole-per- 
ceivers; they see few details and their 
perception is best described qualita- 
tively in terms of its undifferenti- 
ated character. Four- and 5-year- 
olds react less in terms of wholes and 
more often notice and comment on 
the parts. At 6 years another, and 
distinct, change occurs: an abrupt 
and marked increase in perceptual 
responses to the small and rarely 
noticed areas in the blots. This at- 
traction to tiny details is interpreted 
as an intensification of the develop- 
ment of differentiation. At 9 years 
begins the final phase of perceptual 
development—that of synthesis and 
integration. This final phase termi- 
nates in the appearance of predomi- 
nantly synthesizing activity. In the 
integrated whole response, the blot is 
perceptually articulated and then re- 
integrated into a well differentiated 
unified whole. 

Having considered perceptual de- 
velopment in children we would ex- 
pect, according to the regression hy- 
pothesis, a reversal of this pattern in 
schizophrenia. Further, we would 
expect that the greater the pathology 
the more immature the perception. 

Experiments, particularly those by 
Friedman (1953) and Siegel (1953), 
reveal the following relationships in 
perceptual function between schizo- 
phrenics and children: 

With respect to the develop- 
mentally immature response, there 
exists no significant difference be- 
tween children and schizophrenics, 
and both groups differ significantly 
from normal adults. The same is true 
of the most advanced percepts. The 
integrated whole response discrimi- 
nates each of the three groups from 
each other. Thus, these findings 
justify the conclusion that schizo- 
phrenics, in some respects, respond 
perceptually in a manner similar to 
that of children, and in other aspects, 
they occupy an intermediate position 
between normal adults and children. 
This may be understood in terms of 
the hypothetical construct of regres- 
sion. In this regard regression seems 
evident, but it is not of such a total 
nature as to completely eradicate the 
history of the individual who has 
once operated on a higher develop- 
mental level. 

Now, what may be said regarding 
the schizophrenic subtypes? There is 
little or no evidence on which to dis- 
criminate the perceptual functioning 
of the hebephrenics and catatonics 
from each other, and no work has 
been done with simple schizophrenics. 
However, developmentally compar- 
ing paranoid schizophrenics with the 
combined hebephrenic and catatonic 
group (Siegel, 1953) we find the fol- 
lowing: while the perception of para- 
noid schizophrenics is typically frac- 
tionated and fragmented with em- 
phasis on perceptual analysis, re- 
sembling the performance of children 
from 6 to 10, that of the hebephrenic 
and catatonic schizophrenics is char- 
acteristic of the global, amorphous 
perceptual activity of 3- to 5-year-old 
children. 

Comparative-developmental theory 
thus permits the location of cata- 
tonics, hebephrenics, and paranoids 
on a developmental scale. In all 
aspects of cognitive functioning, in 
addition to perception, paranoid 
schizophrenics are expected to per- 
form more like the normal adult than 
the catatonic or hebephrenic schizo- 
phrenic. The theory does not, however, at- 
tempt to state the conditions which 
facilitate or inhibit the depth of re- 
gression in these diagnostic cate- 
gories. At this stage in its develop- 
ment the theory has paid relatively 
little attention to motivational as- 
pects of schizophrenia. Among clin- 
ical practitioners this conceptual vac- 
uum has been filled by psychody- 
namic theories. 

There are other aspects of percep- 
tual development and regression that 
are instructive here: 

The extreme lability that we see in 
primitive emotional behavior is also 
seen in the perceptual sphere. Those 
who have worked intensively with 
schizophrenics or with young children 
cannot avoid being impressed by the 
extreme lability of their attention. 
This, in both the child and in the 
schizophrenic, may be attributable to 
a kind of perceptual passivity in 
which competing stimuli have equal 
potential for evoking a perceptual 
response. This notion of stimulus 
equipotentiality may be useful in 
understanding the severe stimulus 
boundedness of the child and schizo- 
phrenic. 

The child is stimulus bound in that 
the stimulus must be attended to. An 
infant’s eyes must follow the hand 
that goes before it. His hand must 
grasp the object that is placed in it. 

The schizophrenic is similarly stim- 
ulus bound. Stimuli that compete 
for a perceptual response cannot be 
adequately discriminated in terms of 
their relevance to a task. Thus, the 
schizophrenic complains of a rapidly 
shifting, kaleidoscopic world. A pa- 
tient seen by the author complained 
continually that he could not attend 
to anything for very long because 
everything and anything disrupted 
his thoughts. Apparently irrele- 
vant details demanded his atten- 
tion: a noise outside, lights passing 
by at night, an apparently random 
thought, or a bodily sensation had 
as strong a claim on his attention as the 
topic being discussed or the task at 
hand. This extreme interpenetration 
of the schizophrenic's attention and 
thought by apparently random stim- 
uli is a well known phenomenon and 
has been well described by Cameron 
(1939), Kasanin (1944), and others. 


LEARNING 


The developmental approach to 
learning derives from the notion 
that development is characterized by 
qualitatively different processes and 
modes of organization, rather than 
by simply quantitative variations in 
process. This approach is therefore 
in opposition to those theoretical ori- 
entations which view learning as re- 
ducible to a single process. Devel- 
opmental theory does not conceive of 
any one process as being paradig- 
matic of the whole range of human 
learning. A view which reduces all 
learning to a single process conceives 
of the adult as having available more 
response alternatives than the child. 
A genetic point of view conceives of 
the adult and child as utilizing dif- 
ferent processes which may not be 
distinguishable in terms of efficiency 
or achievement. 

Developmental theorists thus seek 
to understand the nature of human 
learning through the exploration of 
qualitatively distinct organizational 
stages. Such an exploration was un- 
dertaken in a recent study by Gold- 
man and Denny (in press). They 
presented two kinds of learning tasks 
to children 5-14 years old. Perform- 
ance in the first learning task de- 
pended on apprehending the regular 


pattern of the pre-established pro- 
gram (response to two switches in a 
right-right-left-left sequence). Per- 
formance in this task increased 
steadily with age and IQ. In the sec- 
ond task rewards were received ac- 
cording to a predetermined, random 
"probability" program in which one 
response was rewarded 25% of the 
time and the other response was re- 
warded 75%. Performance in this 
task was essentially invariant with 
age and IQ with the trend somewhat 
favoring the younger children. Inso- 
far as these developmental curves 
were strikingly different they were in- 
terpreted as indicating that the per- 
formances on the two learning tasks 
reflected different processes. Insofar 
as the sequential, or “recursive,” 
task required an active seeking for a 
general rule for its solution, it was 
interpreted as requiring a more ad- 
vanced mode of functioning than 
that on the probability or "stochas- 
tic" task which permitted a more pas- 
sive orientation to the task in that it 
did not provide for such an easily 
generalizable solution. 
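
The difference between the two reward programs can be made concrete with a small simulation. This is only a sketch under stated assumptions, not a reconstruction of the Goldman and Denny procedure: the function names, the number of trials, the choice of the right switch as the 75% response, and the reading of the probability program as an independent chance of reward on each trial are all illustrative assumptions.

```python
import random

# The fixed repeating program described in the text: right-right-left-left.
PATTERN = ("R", "R", "L", "L")


def recursive_reward(trial, response):
    """Recursive task: a response is rewarded only if it matches the pattern."""
    return response == PATTERN[trial % len(PATTERN)]


def make_stochastic_reward(seed, p_right=0.75):
    """Stochastic task (assumed form): 'R' is rewarded with probability 0.75
    and 'L' with probability 0.25, independently on every trial."""
    rng = random.Random(seed)

    def reward(trial, response):
        p = p_right if response == "R" else 1.0 - p_right
        return rng.random() < p

    return reward


def proportion_rewarded(policy, reward_fn, n_trials=400):
    """Run a policy (trial index -> 'R' or 'L') and return its reward rate."""
    rewarded = sum(1 for t in range(n_trials) if reward_fn(t, policy(t)))
    return rewarded / n_trials


def follow_pattern(trial):
    """A subject who has discovered the right-right-left-left rule."""
    return PATTERN[trial % len(PATTERN)]


def always_right(trial):
    """A subject who simply keeps pressing the right switch."""
    return "R"


print(proportion_rewarded(follow_pattern, recursive_reward))
print(proportion_rewarded(always_right, recursive_reward))
print(proportion_rewarded(always_right, make_stochastic_reward(seed=0)))
```

Under these assumptions, discovering the rule lifts the reward rate on the recursive task from one half to certainty, whereas on the stochastic task no rule can do better than staying with the more frequently rewarded switch. This is one way of seeing why only the first task would favor an active search for a general solution.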

A third learning process that may 
represent the most primitive level for 
humans is classical conditioning, in 
which the stimulus is presented 
wholly at the discretion of the experi- 
menter and the response is usually of 
a physiological or reflexive nature. 
Developmental studies of classical 
conditioning suggest that conditioned 
responses can be established very 
early in life and indeed that young 
children can be more easily condi- 
tioned than older children and 
adults (Jones, 1928, 1930a, 1930b; 
Kasatkin & Levikova, 1935; Ma- 
teer, 1918; Razran, 1933, 1935). The 
developmental primitivity of clas- 
sical conditioning is further suggested 
by studies which indicate that sus- 
ceptibility to conditioning is en- 
hanced in states of lowered conscious- 
ness (Leuba, 1940, 1941; Scott, 1930). 

Thus, at least three modes of learn- 
ing are suggested which, in the order 
from most primitive to most ad- 
vanced, are: learning by classical con- 
ditioning, stochastic learning (instru- 
mental conditioning), and recursive 
learning (problem solving). The first 
level appears to be characteristic of 
the learning of very young children 
and of infrahuman animals. Here the 
learner is a kind of passive “victim” 
of his environment in that he does 
little of an active nature to learn; 
learning, the pairing of stimuli and 
response, is imposed upon him.* The 
second learning mode is distinguished 
from the first in that the learner is 
active or "instrumental" in the learn- 
ing process, yet the learning process 
is essentially by rote. In this learn- 
ing mode young children and adults 
do equally well, as do subjects of 
varying intelligence. The third learn- 
ing mode is not only the most active 
in that there is a deliberate seeking 
for order and regularity, but there is 
a vigorous development and testing 
of solution hypotheses. This learn- 
ing mode favors older and more in- 
telligent subjects. 

With growth—phylogenetic and 
ontogenetic—classical conditioning is 
less adaptive and recedes to the back- 
ground until called upon when the 
task situation calls for no more pro- 
found level of intellection. The other 
modes of learning emerge later to bet- 
ter serve the individual’s needs. 

In schizophrenia it is proposed that 
this development is reversed, with 
sequential learning and other forms 
of complex learning situations being 
affected most and classical condition- 
ing ascending in relative importance. 

* A similar viewpoint was expressed by 
Gesell (1938). 

Schizophrenics have been found to 
be more readily conditioned than 
normals in relatively simple situa- 
tions in which the response alterna- 
tives are limited and the response re- 
flexive. This has been demonstrated 
for the knee jerk (Pfaffman & 
Schlossberg, 1936), the psychogal- 
vanic response (Mays, 1934; Ship- 
ley, 1934), and eyeblink (Spence & 
Taylor, 1953). Schizophrenics have 
also been shown to exceed neurotics 
in eyeblink conditioning (Taylor & 
Spence, 1954). However, since some 
studies have failed to demonstrate 
the greater conditionability of schizo- 
phrenics over normals (Howe, 1958; 
Paintal, 1951), the question is raised 
as to what stimulus conditions en- 
hance the establishment of the con- 
ditioned response in schizophrenics 
as compared to normals. 

In accordance with the regression 
hypothesis, the increase in suscepti- 
bility to conditioning in schizo- 
phrenia should be accompanied by a 
decrement in performance of com- 
plex tasks. By "complex" tasks are 
meant tasks which permit wide re- 
sponse alternatives, among which 
are many irrelevant ones, and in 
which an active role of the learner is 
required. Schizophrenics have been 
found to perform poorly relative to 
the performance of control normals 
in these complex tasks (Cameron, 
1939; Hanfmann, 1939; Hanfmann & 
Kasanin, 1942; Rapaport, 1945). 

The increased conditioning per- 
formance and the decreased perform- 
ance in complex tasks, in schizo- 
phrenia as compared to normals, have 
been interpreted by Mednick (1958) 
and other learning oriented theorists 
(e.g., Taylor & Spence, 1954) in 
terms of the effect of drive intensifi- 
cation (anxiety) on the response 
strength of the conditioned response. 
A difficulty with this type of Hullian 
interpretation is that it fails to take 
into account developmental data. 
The superior performance of children 
and infrahuman animals relative to
normal adults in conditioning experi- 
ments can hardly be incorporated 
within such a theoretical framework 
unless one postulates the existence of 
a heightened drive state in these more 
primitive organisms. Genetic theory 
offers the parsimonious incorporation 
of data from all of these areas within 
a single theoretical structure. 

When a stable stimulus-response 
relationship has been established the 
response may be elicited by other 
stimuli similar in some manner to 
the initial stimulus. This is stimulus 
generalization. 

The genetic principle that differ- 
entiation proceeds from an initial 
stage of globality would suggest that 
in development stimulus generaliza- 
tion would decrease.  Reiss (1946) 
found that young children tend to 
generalize readily to homophones but 
this tendency disappears at about 11 
years of age. Mednick and Lehtinen 
(1957) found that amount of stimulus 
generalization reactivity, measured 
along a visual-spatial dimension of 
similarity, was significantly greater 
for younger children (7-9 years) 
than for older children (10-12 years). 

The expectation then would be 
that in schizophrenia stimulus gen- 
eralization would be higher than in 
normals of comparable intelligence. 
A number of studies testify that this 
is so (Cameron, 1938; Garmezy, 
1952; Mednick, 1955). 


THINKING AND LANGUAGE


Thinking and language may be in- 
vestigated from the vantage of many 
dimensions. Three which appear to 
the author to be most central and in- 
clusive are the development from 
idiosyncrasy to consensuality of con- 
cepts, from lability to stability of 
concepts, and from contextualization 
to autonomy of concepts. 

The development from idiosyncrasy
to consensuality refers to the
increasingly more public and predict- 
able thinking of which the child be- 
comes capable as he grows older 
(Pollack, 1953; Werner & Kaplan, 
1952). Thus, the agreement in the 
meaning of words among members of 
a given speech community increases 
with age. Children, in contrast to 
adults, use words in a private, highly 
individualistic manner (Hayakawa, 
1954). 

In psychopathological regression 
the development toward greater con- 
sensuality in thinking is reversed. 
Idiosyncratic thought then reduces 
the schizophrenic to virtual social 
isolation (Cameron, 1938; Goldman, 
1960). 

The second dimension is the devel- 
opment from lability to stability of 
concepts. In the young child con- 
cepts are typically labile (Pollack, 
1953). The nature of the concept 
changes rapidly and in a seemingly 
capricious manner (Eng, 1931). 

An example from performance on 
the Object Sorting Test (Rapaport, 
1945) may serve to illustrate concept 
lability. The test consists of a num- 
ber of everyday, common objects 
that are placed on a desk before the 
subject. The typical adult, when 
asked to place these objects into 
meaningful groups so that the ob- 
jects within any one group belong to- 
gether, will form objects into groups 
according to their color, or material, 
or perhaps their use. A subject may 
pick out all red objects and put them 
together, or all wooden objects, or all 
tools. Young children will frequently 
switch the relationship in a very la- 
bile manner (Reichard, Schneider, & 
Rapaport, 1944). Thus, a young
child will first select a red ball, and
then this is placed with a red plate,
the two objects having redness in
common. Then a toy knife is selected
because it goes on the table,
too, like the red plate, and then
pliers are chosen because they are metal
like the knife, and then a pipe because
"the workman uses the pliers
and smokes a pipe."

Similar chain concepts are devel- 
oped by schizophrenics in the same 
task situations. The response of a 
young schizophrenic girl in a task in- 
volving a linear schematization tech- 
nique may serve as an illustration of 
the extreme equivocality, or lability, 
of the relationship between the sym- 
bol and the meaning it symbolizes 
(Goldman, 1960). Linear schematiza- 
tion requires the subject to represent 
a word, in this case a mood term, by 
drawing a line. The subject is asked 
to draw an "angry" line, or a line that 
expresses the word "misery," and 
so on. This subject was asked to 
draw a line that represented the 
word “healthy.” She drew a series of 
different lines. When asked what 
there was in the lines she drew that 
suggested health she responded: “A 
seven upside down, lightning going 
up, the medusa, and this is the med- 
ical sign of health." While the pa- 
tient could not clarify the way in 
which all of these concepts are re- 
lated to health, the response invites 
speculation about the way each 
thought was related to the one that 
preceded it. While the experiment 
was in progress she was drinking 
7-Up and remarked that it was 
"good for you." Lightning going up 
may represent a denial of the destruc- 
tive (i.e., unhealthy) effects of light- 
ning. The medusa may be related to 
"the medical sign of health" (the
caduceus) by clang association, or by 
the snakes which are common to 
both. 

In the extreme case concept la- 
bility may be reflected in one word or 
symbol subsuming not only different 
concepts but opposite ones. This has 
been established in dreams (Jones,
1913), in archaic language (Freud,
1950), and also in schizophrenia 
(Goldman, 1960). 

The equivocal nature of symbol 
meaning in childhood and in schizo- 
phrenia appears to be determined by 
the close bond between the symbol 
and some particular situation, event, 
or person with which it is associated. 
This is the third dimension—the de- 
velopment from contextualization to 
autonomy of a concept. Concepts in 
childhood are determined by per- 
sonally relevant experience (Binet, 
1916; Chodorkoff, 1952; Feifel, 1949; 
Hayakawa, 1954; Kasanin, 1944; 
Terman, 1916). A newspaper, for 
example, may be defined as “what 
the paper boy brings and you wrap 
the garbage with it” (Hayakawa, 
1954, p. 80). With growth these con- 
cepts become increasingly independ- 
ent or autonomous of these per- 
sonally meaningful contexts (Werner, 
1940; Werner & Kaplan, 1950, 1952). 

In schizophrenia we expect the re- 
verse of this development: concepts 
should become increasingly less au- 
tonomous and more contextualized. 
There is extensive evidence, both clinical
and experimental (Arieti, 1948;
Baker, 1953; Cameron, 1938; Goldman,
1960; Kasanin, 1944), that this
is so. Vocabulary test performances
lend further credence to the
statement that in comparison to nor- 
mals, schizophrenics tend to use 
words in terms of their concrete func- 
tions rather than in terms of abstract 
autonomous properties (Chodorkoff, 
1952; Feifel, 1949; Harrington, 1954; 
Yacorzyncki, 1941). 

This regression may be illustrated 
by referring again to linear schemati- 
zation. A group of schizophrenics 
were asked to represent the meaning 
of a word in a line. Then inquiry was 
made into the relationship between 
the line and the word it expressed. 
Typically, the line was justified in 
terms of some personally relevant
experience. For example, the word
"gentle" was represented by one patient
as a haystack; she replied
to the inquiry with "lying in the hay
is gentle." Another patient drew two
lines which she said represented the 
path taken by the hand of a mother 
"gently" caressing a child. Still a 
third patient represented the word 
"gentle" with a leaf, which "is 
‘gently’ blowing in the breeze." 
Gentleness in all of these cases is rep- 
resented by unique personal experi- 
ences and associations. Similarly, in 
the Object Sorting Test, schizophrenics
are more inclined than normals
to relate objects in a highly personal
manner: "All of these things
were in my mother's house" or "I 
think they are all pretty." 

Thus, three dimensions of concepts
are suggested. Underlying the first, 
idiosyncrasy-consensuality, is the in- 
creasing stability of concepts. A con- 
cept must be stable in reference be- 
fore it can be public, or consensual. 
Underlying, in turn, the second di- 
mension, is the contextuality-auton- 
omy dimension. If a concept has 
meaning only in terms of personal 
contexts, its reference will be as 
labile as one's personal experiences, 
and therefore not available for use as 
a vehicle for social interaction. 

The second and third dimensions
both reflect the developmental prog- 
ress from globality to differentiation, 
and its dedifferentiation in psycho- 
pathological regression. To the ex- 
tent that a concept is labile, or in the 
extreme, in that it encompasses op- 
posite meanings, it is undifferenti- 
ated. In schizophrenia the vehicles 
of thinking and communication be- 
come progressively dedifferentiated 
in that the symbol and its referent
are not related in a stable manner.
With regard to contextualization
it may be said that the more
autonomous a meaning, the more it is
differentiated from a particular con- 
text. Thus, in development there is 
progressive meaning-context differ- 
entiation, while in schizophrenia 
meaning and context are dedifferen- 
tiated. 

Normal subjects more frequently 
reflect less situational meanings and 
attempt to represent some essential 
quality of gentleness. The word 
"gentle" is typically symbolized by 
normals by a light curved line, expressing
the "soft," "light" aspects
of "gentle." The autonomous meaning
of a word is essential in that it
abstracts from each of the many situations
with which it is associated
(lying in hay, mother caressing child,
etc.) a commonality that each shares.
The essential meaning of a concept is 
abstracted from but is relatively au- 
tonomous of concrete contexts. 


SOCIALIZATION 


In the development of social be- 
havior we again see the increasing 
differentiation out of the state of 
globality which terminates in inte- 
gration. We have little reason to be- 
lieve that in the neonate the self is 
distinguished from others. Accord- 
ing to psychoanalytic theorists the 
mother, her breast, her voice, the 
warmth of her body, the sensations 
from within the infant's own body, 
are an indistinguishable whole. With 
development, there is an increasing 
awareness of the self as an entity. 

The development in social integra- 
tion is seen in patterns of play 
(Buehler, 1935; Loomis, 1931). At 
first, young children play in isolation 
with their hands, feet, or other ob- 
jects. Later, children prefer to play 
in the presence of other children— 
not with other children, but in "parallel"
play. Differentiation has taken
place with this first step toward integration,
which will eventually lead to
genuine interpersonal interaction.

This development toward social in- 
tegration is also seen in the increas- 
ing complexity of the social groups, 
and in their increasing stability 
(Zaluzhni, 1930). 

In schizophrenia we find similar 
processes, except in reverse. On the 
ward we can see interaction representing
all of these phases: the suspicious,
hostile paranoid who still
seeks social interaction; the hallucinating,
babbling chronic schizophrenic
who somehow still prefers to
hallucinate and babble in the presence
of others, although not with or
in concert with others; and finally,
the totally regressed isolate who withdraws
into the social vacuum of a
corner of the ward and devotes himself
to his own bodily sensations.


MOTOR FUNCTIONS 


One of the most striking develop- 
ments to take place in the motor 
sphere is the increase in the implicit- 
ness of motor activity. Vicarious 
movements replace overt activity in 
reasoning, problem solving is less 
vocal and more silent, motion in gen- 
eral is less gross. 

Relative to the massive debilitation
in other spheres there is
little motor involvement in
schizophrenia. It is only in the most
severe regression that motor impairment
is found, as in catatonic
cerea flexibilitas and in the hyperactivity
and restlessness that sometimes
characterize the acute stage of
schizophrenia. In chronic schizophrenia,
too, there is frequently evidence
of incessant repetitive movements
of head, trunk, or limbs.
The fact that there is little motor 
involvement in schizophrenia, except 
in severe cases, is consistent with 
Hughlings Jackson's principle that 
those functions which are the latest 
to develop are the first to be impaired 
in pathology. Since motor functions
are amongst the first to develop in in- 
fancy, we would therefore expect im- 
pairment in this sphere to develop last. 

There are other dimensions that 
have not been considered. In each of 
those that have been discussed, the focus
has been on structural similarities between
the functioning of young children
and that of schizophrenics. Such similarities
in process are also distinguishable
in primitive cultures and in
states of lowered consciousness, such 
as dreams, drug states, and hypno- 
gogic conditions. 

A comparative-genetic approach is 
fruitful in our effort to understand 
the essential nature of schizophrenia 
because it seeks to expose process 
rather than assess achievement and 
it is an approach in which structure 
is no less important than content and 
function. 

Although a structural point of 
view has been central in the systems 
of some theorists for some time 
(Arieti, 1957; Munroe, 1955; Rapa- 
port, 1951a, 1951b), psychoana- 
lytic orthodoxy has not given suffi- 
cient attention to structural elements 
until recently. Having concerned it- 
self in its early development predomi- 
nantly with primary process, psycho- 
analysis is now turning increasingly 
more to a consideration of secondary 
process. Merton Gill (1959) has for- 
malized this emphasis of the struc- 
tural point of view in psychoanalysis. 

This more energetic psychoanalytic 
consideration of ego functions, and 
the theoretical approach that has 
been offered in this paper have a 
similar goal—the formulation of a 
comprehensive theory of human be- 
havior. Such genetic approaches re- 
mind us that in our consideration of 
the schizophrenic, oral deprivation is
no more significant a datum than is
the inability to conceive of square 
things in terms of their squareness. 


THE RELIABILITY OF A RESPONSE MEASURE:
DIFFERENTIAL RECOGNITION-THRESHOLD SCORES 


DONN BYRNE AND 
University of Texas 


The problem of reliability of meas- 
urement is a familiar one in the con- 
text of test construction and test 
evaluation. In other types of investi- 
gation, however, responses frequently 
are measured in a variety of ways by 
a variety of scoring procedures (often 
a priori ones) without evident con- 
cern about measurement. Although 
psychometric issues appear to be for- 
eign to experimental methodology, 
any specified set of stimuli may be 
conceptualized as a test and the 
quantification of subjects’ responses 
as test scores. Viewed in this way, 
such scores should be evaluated ac- 
cording to accepted standards for 
psychological tests (American Psy- 
chological Association, 1954). 

An investigation of the validity of 
a response measure is usually implied 
in the design of an experiment; relia- 
bility, however, is often ignored. 
Whether results are positive or nega- 
tive with respect to one’s hypothe- 
ses, reliability of measurement can 
assume great importance. Psycholo- 
gists have the disconcerting tendency 
to create a new methodology for each 
experiment. In work on perceptual 
defense, for example, it is difficult to 
find any two studies in which the 
same stimulus is presented in the 
same way to evoke responses which 
are quantified in the same manner. 
It seems reasonable to hypothesize 
that some of these stimuli presented 
to subjects in a particular way are 
going to yield more reliably measured 
response dimensions than others. 
With a heterogeneous methodology 
and unknown reliability coefficients, 
it should not be surprising to find
some degree of inconsistency across 
experiments not attributable to theo- 
retical weaknesses. Thus, generally 
positive results mixed with some neg- 
ative results may reflect differentially 
reliable "tests." Even the positive 
results are no guarantee that reliable 
measures have been employed. Mc- 
Nemar (1960) suggests several fac- 
tors which act to make published re- 
sults more likely to involve a false 
rejection of the null hypothesis than 
the .05 level of significance would 
suggest. In addition, it would seem 
logical to construct reliable measur- 
ing techniques as a preliminary step 
in experimental work rather than as 
an afterthought. 


DIFFERENTIAL RECOGNITION 
THRESHOLDS 


As an example of inadequate meas- 
urement techniques, some of the 
“new look” experiments in percep- 
tion of the past decade will be briefly 
reviewed. In studies of perceptual de- 
fense, differential recognition thresh- 
olds for emotionally toned vs. neu- 
tral stimulus material have fre- 
quently served as the dependent var- 
iable and as a measure of individual 
differences in defensiveness. 

All four types of reliability should 
be considered in utilizing a dif- 
ferential recognition-threshold score. 
First, if any subjectivity is involved 
in the scoring process, there should 
be some determination of the extent 
to which independent judges are able 
to arrive at approximately identical 
scores. Interscorer agreement is a
necessary, but not sufficient, condition
for reliability of measurement.
Unfortunately, many investigators 
determine only the reliability of the 
scoring procedure rather than of the 
scores themselves. Second, if a series 
of discrete, presumably homogeneous 
responses are combined to form a 
total score, it is important to deter- 
mine the extent to which this score 
is internally consistent. Third, if the 
score is considered to be indicative of 
an enduring personality characteristic,
it is essential to know the extent
to which this score is stable over 
time. Fourth, if a different but theo- 
retically equivalent set of stimuli is 
employed to elicit responses, the 
equivalence of the two sets of scores 
should be determined. 

A review of perceptual defense and 
related studies which have used a dif- 
ferential recognition threshold sug- 
gests that a thorough examination of 
reliability is unusual. As might be an- 
ticipated, the reliability coefficient 
which is most frequently reported is 
that of interscorer consistency, and 
the results are generally quite good 
(Eriksen, 1951a, 1951b; Kogan, 1956; 
Lazarus, Eriksen, & Fonda, 1951; 
Stein, 1953). Internal consistency is 
less frequently investigated, and the 
reported coefficients range from good 
(Vanderplas & Blake, 1949) to medi- 
ocre (McClelland & Liberman, 1949) 
to unsatisfactory (Eriksen, 1951a, 
1951b). Holtzman and Bitter- 
man (1956) reported that perceptual 
thresholds for taboo and neutral 
words were unreliable measures; 
therefore, they were eliminated from 
a factor analytic study. An investi- 
gation of the stability of differential 
recognition thresholds over time was 
not reported in any of the studies re- 
viewed. Stein's (1953) data indicate 
that equivalent forms of the stimulus 
which he used yielded very similar
results. The majority of the studies
using a differential threshold score as
a variable report no reliability infor- 
mation (Beier & Cowen, 1953; Car- 
penter, Wiener, & Carpenter, 1956; 
Chodorkoff, 1954; Cowen & Obrist, 
1958; Greenbaum, 1956; Kissen, 
Gottesfeld, & Dicks, 1957; Kurland, 
1954; Postman & Brown, 1952; 
Smith, 1954; Spence, 1957; Wiener, 
1955; Zuckerman, 1955). 


AN UNRELIABLE SCORE 


The senior author planned to use 
the differential recognition threshold 
for hostile vs. neutral words pre- 
sented tachistoscopically as a cri- 
terion measure for a new test de- 
signed to measure repressing and sen- 
sitizing defenses. It should be con- 
fessed that the reliability investiga- 
tion was undertaken only when cer- 
tain difficulties were encountered. 

Twenty pairs of hostile and neu- 
tral words were each matched for 
length, initial letter, and frequency 
of occurrence in one million words 
according to the Thorndike-Lorge 
(1944) word count. Hostile words 
were defined as those representing be- 
havior involving the derogation, in- 
jury, or destruction of either animate 
or inanimate objects. Neutral words 
were defined as those which were not 
emotionally toned. A word was as- 
signed to either category on the basis 
of the unanimous agreement of three 
independent judges. The 40 words 
were placed on slides and arranged in 
random order. 

The slides were used with a Key- 
stone Overhead Projector equipped 
with a Flashometer. Following a 
demonstration with a neutral practice
word, each stimulus word was
presented at 1/100, 1/75, 1/50,
1/37.5, 1/25, 1/10, and 1 second.
After each trial, the subject responded
by writing down his best
guess as to the word presented. Subjects
were seen in small groups.
The threshold for each word consisted
of the first trial on which
that word was correctly recognized. 
Scores ranged from 1 (correct recog- 
nition at the 1/100 presentation) 
through 8 (failure to recognize the 
word on any trial). A subject's mean 
threshold on the 20 neutral words 
minus his mean threshold on the 20 
hostile words yielded a defense score. 
Presumably, a positive score would 
indicate a sensitizing reaction and a 
negative score a repressing reaction. 
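The scoring rule just described can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, and "correct recognition" is reduced here to an exact, case-insensitive match, whereas the authors note that some scorer judgment was involved.

```python
# Exposure durations in seconds, in the order given in the text.
DURATIONS = [1 / 100, 1 / 75, 1 / 50, 1 / 37.5, 1 / 25, 1 / 10, 1.0]
FAIL_SCORE = len(DURATIONS) + 1  # 8: the word was never recognized


def word_threshold(guesses, target):
    """Return the 1-based trial of first correct recognition, or 8 on failure.

    Assumption: a guess counts as correct only if it matches the target
    exactly (ignoring case and surrounding whitespace).
    """
    for trial, guess in enumerate(guesses, start=1):
        if guess.strip().lower() == target.lower():
            return trial
    return FAIL_SCORE


def defense_score(neutral_thresholds, hostile_thresholds):
    """Mean threshold on neutral words minus mean threshold on hostile words."""
    mean_neutral = sum(neutral_thresholds) / len(neutral_thresholds)
    mean_hostile = sum(hostile_thresholds) / len(hostile_thresholds)
    return mean_neutral - mean_hostile


# Hypothetical protocol: the word is first written correctly on trial 2.
print(word_threshold(["kite", "knife", "knife"], "knife"))
# Hypothetical thresholds for two neutral and two hostile words.
print(defense_score([4, 4], [2, 4]))
```

Under this rule a positive value means the hostile words were recognized at shorter exposures than their neutral matches.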

Disappointing results in cross-vali- 
dating the test that was being devel- 
oped led to a belated investigation of 
the reliability of the criterion. Dif- 
ferential thresholds were obtained 
for almost 600 subjects, men and 
women enrolled in general education 
courses at San Francisco State Col- 
lege. From this total, a sample of 50 
was drawn. Because some subjec- 
tivity enters into the determination 
of the trial on which correct recogni- 
tion first occurs, the authors scored 
these protocols independently. The 
defense scores had respectable inter- 
scorer consistency as shown by a cor- 
relation of .91. 

The second type of reliability considered
was internal consistency.
When differential thresholds are ob- 
tained, it is assumed that there is 
some response homogeneity with re- 
spect to stimulus content. In this 
study, responses to 20 of the pro- 
jected words should have been de- 
termined in part by their common 
reference to hostility, and these re- 
sponses should to some extent differ 
from those evoked by the 20 nonhostile
words. Therefore, split-half
reliability was determined by divid- 
ing the hostile words into odd and 
even groups and computing the dif- 
ferential threshold scores for these 
two groups compared to their match- 
ing neutral words. The coefficient of 
internal consistency was .00. It was 
not deemed essential to apply the
Spearman-Brown correction formula. 
Thus, independent judges agreed 
about the nature of the stimulus ma- 
terial and about the scoring of the 
subjects’ responses. Nevertheless, 
the resulting scores were unreliable. 
In view of this finding and the con- 
siderations discussed earlier, it is sug- 
gested that, whenever possible, any 
study should include a report of the 
reliability of its response measures. 
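The odd-even procedure described above can be sketched as follows. The data layout and function names are assumptions made for illustration; only the logic (difference scores on odd and even word pairs, correlated across subjects, then optionally stepped up by the Spearman-Brown formula) follows the text.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def half_scores(neutral, hostile):
    """Differential scores on odd and even word pairs for one subject.

    neutral[i] and hostile[i] are the thresholds for matched pair i.
    """
    diffs = [n - h for n, h in zip(neutral, hostile)]
    odd, even = diffs[0::2], diffs[1::2]
    return sum(odd) / len(odd), sum(even) / len(even)


def split_half_reliability(subjects):
    """subjects: list of (neutral_thresholds, hostile_thresholds) tuples."""
    halves = [half_scores(n, h) for n, h in subjects]
    return pearson([o for o, _ in halves], [e for _, e in halves])


def spearman_brown(r_half):
    """Estimated full-length reliability from a half-test correlation."""
    return 2 * r_half / (1 + r_half)


print(spearman_brown(0.0))
```

With a half-test correlation of .00 the Spearman-Brown formula also returns zero, which is why applying it would have changed nothing.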


OF CLINICAL JUDGMENT”


JOE H. WARD, JR. 
The purpose of this discussion is to
comment on possible misunderstandings
that may arise from the discussion
of relative weights presented in
Hoffman's (1960) paper, "The Paramorphic
Representation of Clinical
Judgment." First, the present author
certainly agrees that regression
techniques can be quite useful in the 
study and analysis of judgments. 
Regression analysis can certainly 
play an important role in the study of 
the homogeneity of judgment policies 
among individuals and in the analysis 
of the extent to which variables con- 
tribute to judgment. 

The relative weights presented on 
page 120 in Hoffman's article may 
lead to a certain amount of misunder- 
standing about the “independent 
contribution” of a variable in the 
judgment process. Before discussing 
this point it is necessary to establish 
what is meant by the term "independent
contribution" of a variable.

Consider a set of variables, $X_1$,
$X_2, \ldots, X_k$, which all have mean
values equal to 0. The independent
contribution to prediction of $Y$ of a
single predictor, say $X_1$, refers to the
amount of predictive efficiency that
the residual vector $E$ in the vector $X_1$
can make when predicting the criterion
$Y$, where the residual $E$ refers to
the error remaining when $X_1$ is predicted
from a least squares combination
of $X_2, X_3, \ldots, X_k$.

That is, if:

$$X_1 = w_2 X_2 + w_3 X_3 + \cdots + w_k X_k + E$$

where values for $w_i$ ($i = 2, \ldots, k$) are
"least squares" coefficients; and if:

$$Y = b_1 E + G$$

where $b_1$ is determined by least
squares and $G$ is the error remaining
when $Y$ is predicted from $E$, then the
idea of independent contribution
refers to the extent to which the
residual $E$ can account for the criterion
$Y$. Frequently the term independent
contribution refers to the
proportion of the total variance of $Y$
that is accounted for by the residual
in $X_1$.

1 The research reported in this paper was
sponsored by Personnel Laboratory, Wright
Air Development Division, under Research
and Development Project 7719, Task 17112.
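The definition above can be made concrete with a small sketch. It is restricted, as an assumption for brevity, to the case of a single other predictor (k = 2), and the function names and data are hypothetical. The residual of the first predictor is formed by least squares, and the squared correlation of the criterion with that residual is returned.

```python
def center(values):
    """Subtract the mean so that the variable has mean 0."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def residual(target, predictor):
    """Least-squares residual of target after removing predictor (both centered)."""
    w = dot(target, predictor) / dot(predictor, predictor)
    return [t - w * p for t, p in zip(target, predictor)]


def squared_corr(u, v):
    """Squared correlation between two centered variables."""
    return dot(u, v) ** 2 / (dot(u, u) * dot(v, v))


def independent_contribution(x1, x2, y):
    """Proportion of criterion variance accounted for by the residual E in x1."""
    x1, x2, y = center(x1), center(x2), center(y)
    e = residual(x1, x2)  # the part of x1 not predictable from x2
    return squared_corr(y, e)


# Hypothetical data: two orthogonal predictors and a criterion equal to their sum.
x1 = [1, 1, -1, -1]
x2 = [1, -1, 1, -1]
y = [2, 0, 0, -2]
print(independent_contribution(x1, x2, y))  # 0.5
```

In this toy case the two predictors together reproduce the criterion exactly and the second predictor alone accounts for half of its variance, so the value agrees with the difference between the two squared correlations (1.0 minus .5).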

A regression coefficient reflects the 
value of the independent contribution 
only when the regression coefficient 
equals zero—and then, of course, 
there is no independent contribution. 
When a particular regression coeffi- 
cient is different from zero, very little 
can be said about the independent 
contribution that the particular variable
associated with the coefficient
makes toward prediction of the dependent
variable.

The concept of relative weight 
might lead to some confusion about 
the independent contribution of the 
corresponding predictor. Not only 
does it seem difficult to attach meaning
to positive nonzero relative
weights, but it seems particularly
difficult to interpret negative relative
weights.

Consider first the specific example
that Hoffman presents on page 122,
in which it is assumed that $r_{01} = .400$,
$r_{02} = .000$, $r_{12} = .707$. The solution of
the matrix is indicated to yield
$\beta_{01} = .800$, $\beta_{02} = -.566$, $R_{0.12} = .566$,
and squaring the value of $R_{0.12}$ we
obtain $R^2_{0.12} = .32$. Since the square
of the correlation of a predictor with
the judgment criterion indicates the
proportion of variance accounted for
when no other variables are considered,
we can observe that the
proportion of variance accounted for
by Predictor Number 1 is .16. It is
quite apparent that the second predictor,
when predicting alone, accounts
for no variance in the criterion.
However, let us see what its
independent contribution is. The
$R^2$ of the least-squares combination
is .32; therefore, even though
the second predictor accounts for no
variance when predicting alone, its
independent contribution is equal to
16% ($.32 - .16$) of the total criterion
variance. From this it becomes apparent
that, whereas the first predictor
when used alone can account
for only 16% of the total criterion
variance, the use of the second predictor
provides an additional 16% of
the variance. Furthermore, we can
see that the independent contribution
of Predictor Number 1 is 32%
($.32 - .0$) of the total criterion variance.
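The figures in this example can be checked by hand. The expressions below are the usual textbook formulas for standardized weights with two predictors, supplied here as an aid and not quoted from Hoffman or Ward; $.707^2$ is rounded to $.500$.

```latex
\begin{align*}
\beta_{01} &= \frac{r_{01} - r_{02}\,r_{12}}{1 - r_{12}^2}
            = \frac{.400 - (.000)(.707)}{1 - .500} = .800 \\
\beta_{02} &= \frac{r_{02} - r_{01}\,r_{12}}{1 - r_{12}^2}
            = \frac{.000 - (.400)(.707)}{1 - .500} \approx -.566 \\
R^2_{0.12} &= \beta_{01}\,r_{01} + \beta_{02}\,r_{02}
            = (.800)(.400) + (-.566)(.000) = .320 \\
R^2_{0.12} - r_{02}^2 &= .320 - .000 = .320
            \quad\text{(independent contribution of Predictor 1)} \\
R^2_{0.12} - r_{01}^2 &= .320 - .160 = .160
            \quad\text{(independent contribution of Predictor 2)}
\end{align*}
```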

Several more examples are presented
in Table 1. The columns of
Table 1 are defined as:

$r_{01}$ = correlation between Predictor 1 and the criterion
$r_{02}$ = correlation between Predictor 2 and the criterion
$r_{12}$ = correlation between Predictor 1 and Predictor 2
$\beta_{01}$ = standardized regression coefficient for Predictor 1
$\beta_{02}$ = standardized regression coefficient for Predictor 2
$R^2_{0.12}$ = squared multiple correlation resulting from prediction by Variables 1 and 2
$R^2_{01.2} = R^2_{0.12} - r^2_{02}$ = the independent contribution of Predictor 1
$R^2_{02.1} = R^2_{0.12} - r^2_{01}$ = the independent contribution of Predictor 2
$w_{01}$ and $w_{02}$ = relative weights for predictors (see Hoffman, 1960, p. 122)

TABLE 1
EXAMPLES OF SOLUTIONS TO PREDICTOR REGRESSION PROBLEMS

                                                                                  Independent contributions   Relative weights
No.  $r_{01}$  $r_{02}$  $r_{12}$  $\beta_{01}$  $\beta_{02}$  $R^2_{0.12}$  $R^2_{01.2}$  $R^2_{02.1}$  $w_{01}$  $w_{02}$
 1     .400      .000      .707        .800         -.566         .320          .320          .160        1.000      .000
 2     .400      .000      .100        .404         -.040         .162          .162          .002        1.000      .000
 3     .800      .000      .100        .808         -.081         .646          .646          .006        1.000      .000
 4     .800      .000      .600       1.250         -.750        1.000         1.000          .360        1.000      .000
 5     .700      .000      .700       1.373         -.961         .961          .961          .471        1.000      .000
 6     .200      .700      .800      -1.000         1.500         .850          .360          .810        -.235     1.235
 7     .100      .800      .500       -.400         1.000         .760          .120          .750        -.053     1.053
 8     .998      .000      .070       1.002         -.070        1.000         1.000          .002        1.000      .000
 9     .707      .000      .707       1.414        -1.000        1.000         1.000          .500        1.000      .000
10     .070      .000      .998       14.29        -14.27        1.000         1.000          .995        1.000      .000

Table 1 reveals some of the difficulty
of using the idea of relative
weight in the interpretation of contributions
of individual variables. It
can be observed, for example, that
the relative weights in several different
problems can be identical while
the independent contributions can be
quite different. Evidently, the concept
of relative weight will not provide
any information about the
independent contribution of a predictor.
In general, the relative weight
will not reveal anything about the
independent contributions of the
predictors except in the special case
of orthogonality among the predictors.
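The contrast drawn from Table 1 can be reproduced with a short sketch. Two assumptions are made here: the standardized weights come from the usual two-predictor formulas, and the relative weight is taken to be the product of the beta weight and the validity divided by the squared multiple correlation, a form that reproduces the tabled values but is not spelled out in this comment. The function name is hypothetical.

```python
def two_predictor_solution(r01, r02, r12):
    """Betas, R^2, independent contributions, and relative weights
    for two predictors, computed from the three correlations."""
    denom = 1.0 - r12 ** 2
    beta01 = (r01 - r02 * r12) / denom
    beta02 = (r02 - r01 * r12) / denom
    r_squared = beta01 * r01 + beta02 * r02
    return {
        "beta01": beta01,
        "beta02": beta02,
        "R2": r_squared,
        "indep1": r_squared - r02 ** 2,  # independent contribution, Predictor 1
        "indep2": r_squared - r01 ** 2,  # independent contribution, Predictor 2
        # Assumed form of the relative weight: beta * r / R^2.
        "w01": beta01 * r01 / r_squared,
        "w02": beta02 * r02 / r_squared,
    }


# Correlations for three rows of Table 1 (row number: r01, r02, r12).
rows = {1: (.400, .000, .707), 3: (.800, .000, .100), 6: (.200, .700, .800)}
for number, correlations in rows.items():
    solution = two_predictor_solution(*correlations)
    print(number, {name: round(value, 3) for name, value in solution.items()})
```

Rows 1 and 3 yield the same pair of relative weights although the independent contribution of the second predictor differs between them, which is the point made in the text.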
The term "independent," like so
many others in the jargon, is cursed 
by having acquired a number of 
meanings. Perhaps its usage could be 
restricted to imply only experimental 
independence, and the term "orthogonal"
to convey the notion of statistical
independence. Then both
Ward (1962) and I could be wrong 
together. But as things stand, I be- 
lieve that the differences between us 
are mostly semantic, and therefore 
minor. 

Ward's "independent contribution"
indicates the proportion of
variance in the criterion attributed
to the residual in a predictor, $X_1$,
after variance common to $X_1$ and
other predictors is removed. Such
coefficients are heavily dependent
upon the interrelations among the
variables included for analysis. Precisely,
the independent contribution
of $X_1$ will necessarily be reduced by
another predictor, $X_2$, by an
amount equal to the variance common
to $X_1$, $X_2$, and the criterion, $X_0$.
Let $r^2_{0(1)}$ be the independent contribution
of $X_1$ before the inclusion of
$X_2$ and $r^2_{0(1.2)}$ be its reduced independent
contribution. Then if we
express the amount of this reduction
by $\theta$ and the percent reduction by
$\phi = \theta / r^2_{0(1)}$, it can be shown that:

$$\theta = r^2_{01} + r^2_{02} - R^2_{0(1,2)}$$

$$\phi = \frac{r^2_{01} + r^2_{02} - R^2_{0(1,2)}}{r^2_{01}}$$

This investigation was supported by a
United States Public Health Service Research
Grant (M2097-CI) from the National Institute
of Mental Health.


It follows that although $X_1$ be a
very satisfactory test of high validity, 
it will yield no independent contribu- 
tion if all of its valid variance may be 
predicted from a linear combination 
of a set of other predictors. This 
of course holds even if the set of 
other predictors contains no variable 
which is individually as predictive as
$X_1$. Despite this limitation, coefficients
of independent contribution
are quite useful, especially in empiri- 
cal prediction studies. 
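
The reduction described above can be illustrated numerically. The following is a minimal sketch, assuming the standard two-predictor formulas for standardized regression weights; the function names and the example correlations (.6, .5, .4) are hypothetical and chosen only for illustration.

```python
def beta_weights(r01, r02, r12):
    """Standardized regression weights for two predictors of criterion X0."""
    denom = 1.0 - r12 ** 2
    b01 = (r01 - r02 * r12) / denom
    b02 = (r02 - r01 * r12) / denom
    return b01, b02


def squared_multiple_r(r01, r02, r12):
    """Squared multiple correlation of X0 with X1 and X2."""
    b01, b02 = beta_weights(r01, r02, r12)
    return b01 * r01 + b02 * r02


def reduction(r01, r02, r12):
    """Independent contribution of X1 before and after X2 is included."""
    big_r2 = squared_multiple_r(r01, r02, r12)
    before = r01 ** 2            # X1 is the only predictor
    after = big_r2 - r02 ** 2    # squared part correlation once X2 is included
    theta = before - after       # amount of the reduction
    phi = theta / before         # reduction as a proportion of the original value
    return before, after, theta, phi


# Hypothetical correlations: r01 = .6, r02 = .5, r12 = .4
before, after, theta, phi = reduction(0.6, 0.5, 0.4)
print(before, after, theta, phi)
```

With these illustrative values the reduction equals the sum of the two squared validities minus the squared multiple correlation, so a highly valid predictor loses independent contribution whenever its valid variance is shared with another predictor.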

The problem of assessing the rela- 
tive contribution of a variable, i.e., 
the relative importance of that variable
as compared with others included
with it in the same set, is
different from the problem of predic- 
tion. It is different because the 
primary concern is that of determin- 
ing some mathematical representa- 
tion of relative importance. This 
may be achieved through some kind 
of partitioning of the criterion vari- 
ance. 

The variance of predicted scores 
may be partitioned in many ways, 
but few are psychologically meaning- 
ful. Apportioning it among beta 
coefficients or squared beta coeffi- 
cients is not meaningful, since not all 
of the predictable variance is accounted
for. Thus, in the two-predictor
case, using McNemar's
(1955) notation, the variance of
predicted scores, $\sigma^2_{z'_0}$, may be
expressed as follows:

$$\sigma^2_{z'_0} = \beta^2_{01} + \beta^2_{02} + 2\beta_{01}\beta_{02}r_{12}$$

When $r_{12} \neq 0$, the squared betas
cannot account for the predictable
criterion variance exclusively in
terms of the independent contributions
of the predictors for the simple
reason that there exists a joint con- 
tribution as well. This joint contri- 
bution may be more than modest, 
particularly where the number of 
predictors is large and their inter- 
correlations at least moderate. Beta 
coefficients are therefore inadequate 
simply because no linear combina- 
tion of beta coefficients or of their 
squares exists which will unambigu- 
ously account for the predictable 
variance of judgments. The same 
may be said of Ward's independent 
contribution coefficients. The only 
exception is the special case in which 
the predictors are orthogonal. On
the other hand, relative weight,
defined as:

$$w_{0i} = \frac{\beta_{0i}\,r_{0i}}{R^2_{0(1,2,\ldots,k)}} \tag{1}$$

provides a means of portraying the
relative contributions of each of the
predictors such that a simple sum of
them accounts entirely and unambiguously
for the predictable variance.
For a contrary point of view on
this problem the reader may wish to
refer to Ezekiel (1930).
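
The contrast between the two partitions can be checked with a small computation. This is only a sketch under stated assumptions: it takes the relative weight to be the product of a predictor's beta and validity coefficients divided by the squared multiple correlation, uses the standard two-predictor formulas, and the function names and correlations are hypothetical.

```python
def betas(r01, r02, r12):
    """Standardized regression weights for two predictors."""
    denom = 1.0 - r12 ** 2
    return (r01 - r02 * r12) / denom, (r02 - r01 * r12) / denom


def partition(r01, r02, r12):
    """Return R squared, the sum of squared betas, the joint term, and the relative weights."""
    b01, b02 = betas(r01, r02, r12)
    r_squared = b01 * r01 + b02 * r02       # predictable variance
    squared_betas = b01 ** 2 + b02 ** 2     # part attributed to the squared betas
    joint = 2 * b01 * b02 * r12             # joint contribution, zero only if r12 = 0
    weights = (b01 * r01 / r_squared, b02 * r02 / r_squared)
    return r_squared, squared_betas, joint, weights


# Hypothetical correlations: r01 = .6, r02 = .5, r12 = .4
r_squared, squared_betas, joint, weights = partition(0.6, 0.5, 0.4)
print(r_squared, squared_betas, joint, weights, sum(weights))
```

In this example the squared betas fall short of the predictable variance by exactly the joint term, whereas the two relative weights sum to unity.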

The concept of "relative weight"
was developed to provide a means by 
which the cognitive processes of 
clinicians (and, for that matter, any- 
one making judgments or decisions) 
might be described. It should be 
noted in this connection that the 
problem of describing the judgment 
process differs from the problem of 
prediction in another way. Judgment 
studies of the type described in my 
previous Psychological Bulletin ar- 
ticle (Hoffman, 1960) deal with a 
system of variables which is finite or 
"closed." Since in the experimental 
arrangement by which the judgments 
are obtained only known quantitative
information is available to the judge,
the criterion variance (variance of 
judgments) must be completely ac- 
counted for by two factors: one of 
these involves some combination of 
the predictor information, not neces- 
sarily linear, perhaps quite complex; 
the other factor is chance. This being 
the case, it is meaningful to speak of 
the possibility of measuring the de- 
gree to which the criterion variance 
may be accounted for by the relation 
of one variable to the others avail- 
able to the judge, i.e., within the 
system but completely independent 
of external variables which might be 
thrown into the regression analysis 
at will. 

A fair test of a coefficient which 
presumably reflects the relative con- 
tribution of a predictor in the judg- 
ment process is one which compares 
the value of the obtained coefficient 
when the predictor is a member of the 
set available to the judge with its 
value when the predictor is absent 
but yet included in the multiple 
regression analysis. A predictor 
available to the judge and “used” by 
him should be capable of being de- 
scribed by a coefficient which has at 
least a moderate value. When this 
predictor is experimentally absent 
from the judgment situation, the 
value of the coefficient should drop 
to a chance level. A poor (in this 
sense) type of coefficient would be 
one which is affected little by this 
kind of manipulation. 

The coefficient which Ward refers 
to as independent contribution will 
not ordinarily satisfy this test. The 
introduction of an external predictor 
into such a closed system reduces the 
independent contributions of the 
internal predictors since variance 
which is common to an original 
predictor, an external predictor, and
the criterion is subtracted. Not
everything is bad, however. The 
independent contribution of such an 
external predictor may be expected 
to be no greater than chance, and 
this will assuredly be reflected in the 
near-zero values of the coefficient. 
But the values of the coefficients for 
the original predictors will neverthe- 
less be determined by the interrela- 
tionships among the variables in- 
cluded in the analysis. The coeffi- 
cient of independent contribution is 
therefore unsuitable for this purpose. 
Beta coefficients are likewise affected 
by such manipulations although not 
to such a great extent. On the other 
hand, relative weights pass this test 
with flying colors, as will be shown in 
a forthcoming article. 

Another point raised by Ward has 
to do with the relationship between 
relative weights and coefficients of 
independent contribution. Although 
the concepts are different, I do not 
completely agree that no relation- 
ship exists between relative weights 
and independent contributions. It 
may be shown, for example, that in 
the case of two predictors, multiplying
the ratio of the two independent
contributions by the ratio of the
corresponding squared validity
coefficients yields the ratio of the
squared relative weights. That is:

$$\frac{r^2_{0(1.2)}}{r^2_{0(2.1)}} \cdot \frac{r^2_{01}}{r^2_{02}} = \frac{w^2_{01}}{w^2_{02}} \tag{2}$$

It is of interest also to note that the
ratio of the independent contributions
is equal to the ratio of the
squared beta coefficients. Multiplying
Equation 2 by $\beta^2_{01}/\beta^2_{02}$ we obtain:

$$\frac{r^2_{0(1.2)}}{r^2_{0(2.1)}} \cdot \frac{r^2_{01}\beta^2_{01}}{r^2_{02}\beta^2_{02}} = \frac{w^2_{01}\beta^2_{01}}{w^2_{02}\beta^2_{02}} \tag{3}$$

and since, from Equation 1:

$$r_{0i}\beta_{0i} = w_{0i}R^2_{0(1,2)} \tag{4}$$

it follows that:

$$\frac{r^2_{0(1.2)}}{r^2_{0(2.1)}} = \frac{\beta^2_{01}}{\beta^2_{02}}$$

In contrast, relative weights may
be said to represent a product of two 
proportions: the first is the propor- 
tion that the independent contribu- 
tion bears to the residual of the 
predictor when the joint effects of 
the other predictors have been re- 
moved. This term is identically 
equal to the squared beta coefficient. 
The second term is the proportion of 
total predictable variance in the 
criterion which is common to the 
predictor in question. 
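
The link between independent contributions and squared beta coefficients in the two-predictor case can be made explicit. The following short derivation assumes the usual two-predictor formulas and writes $r_{0(1.2)}$ for the part correlation of the criterion with $X_1$ after $X_2$ has been removed from $X_1$; the notation is an assumption made for illustration.

```latex
\begin{align*}
\beta_{01} &= \frac{r_{01} - r_{02}r_{12}}{1 - r_{12}^2}, &
\beta_{02} &= \frac{r_{02} - r_{01}r_{12}}{1 - r_{12}^2}, \\[4pt]
r^2_{0(1.2)} &= \frac{(r_{01} - r_{02}r_{12})^2}{1 - r_{12}^2}
              = \beta_{01}^2\,(1 - r_{12}^2), &
r^2_{0(2.1)} &= \beta_{02}^2\,(1 - r_{12}^2), \\[4pt]
\frac{r^2_{0(1.2)}}{r^2_{0(2.1)}} &= \frac{\beta_{01}^2}{\beta_{02}^2}.
\end{align*}
```

The common factor $(1 - r_{12}^2)$ is the residual variance of either predictor after the other has been removed, which is why each independent contribution, taken as a proportion of that residual, equals the corresponding squared beta coefficient.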

To conclude, my usage of the 
phrase independent contribution is 
intended to connote that the variance 
of predicted scores may be success- 
fully partitioned into a simple sum of 
ingredients, each referring to a spe- 
cific predictor and each being inde- 
pendent of any joint effect or inter- 
action. What Ward means by the 
term independent contribution is 
more ordinarily known as the “part 
correlation” or, actually, the square 
of the part correlation (cf. DuBois, 
1957). Using the term in Ward's 
sense, I thoroughly agree with him 
that the concept of relative weight 
provides little information about the 
independent contribution of a pre- 
dictor. It was not intended to pro- 
vide such information. It is also 
correct that the concept of independ- 
ent contribution provides little in- 
formation about the contribution of 
a predictor relative to the contribu- 
tions of other predictors in a given 
set, nor is it intended as a method for 
assessing this aspect of the judgment 
process.


In the article by P. L. Broadhurst and J. L. Jinks in the September 1961
issue, the second equation at the top of the second column on page 338 should 
read: 

[h] =F, —4F, LIS 1P,— 1P: --2Bi +2B, 
The first equation in the second column on page 353 should read: 
[h] — [i] - F1 — 4(P. --P)) 


In the article by Charles S. Morrill in the September 1961 issue, the refer- 
ences by R. E. Silverman were published by the United States Naval Train- 
ing Device Center. The reference by N. A. Crowder, 1959b, was published in 
1955. 


In the article by Mark R. Rosenzweig in the September 1961 issue, the 
quotation from Ades and Brookhart on page 384 should read as follows: - 
"that the inferior colliculus with its strong commissural connections and 
connections to efferent [not afferent] mechanisms may be the principal device 
responsible for localization" (1950, p. 203).


TECHNIQUES FOR THE STUDY OF LEARNING
IN ANIMALS:
ANALYSIS AND CLASSIFICATION


M. E. BITTERMAN 
Bryn Mawr College 


Although many different tech- 
niques for the study of learning in 
animals have been developed in the 
60 years or so since the problem of 
animal intelligence first was brought 
into the laboratory, their interrelations
never have been carefully defined.
Crude dichotomies have been
proposed ("respondent conditioning"
versus "operant conditioning"
by Skinner, 1935, 1937; "classical
conditioning" versus "instrumental
conditioning" by Hilgard and Marquis,
1940) and, more recently, a trichotomy
("classical conditioning"
versus "instrumental conditioning"
versus "selective learning" by Spence,
1956), but the diversity of method
is too great to be encompassed in
any such one-way analysis. While
certain differences among the tech- 
niques to be classified must be ignored 
if the number of categories is to be 
smaller than the number of tech- 
niques, the quest for parsimony seems 
to have been carried too far. 

Classification is not merely a matter
of taste. When one can find no
objective basis for evaluating the
conviction that a given difference in 
technique should be stressed or that 
another safely may be disregarded, it 
is only because the proper experi- 
ments have not been performed. Con- 
sider, for example, the question of 
whether the difference between flex- 
ion conditioning with avoidable as 
compared with unavoidable shock 
should be reflected in a classification 
of techniques. The answer is "Yes"
for Hilgard and Marquis, who emphasize
the contingency of reinforcement
on response. They classify flexion
conditioning as "classical" when
shock is unavoidable and as "instrumental"
when shock is avoidable.
The answer is "No" for Spence,
who emphasizes the degree of control 
afforded the experimenter over the 
appearance of the response to be 
learned. Ignoring the contingency of 
shock upon failure of response to the 
CS, Spence treats avoidance condi- 
tioning as a special case of classical 
conditioning in which the pattern of 
reinforcement gradually shifts from 
consistent to intermittent. Such a 
disagreement surely need not remain 
long in the realm of opinion. One 
has only to compare the behavior of 
an animal trained with avoidable 
shock and that of a control animal 
trained with shock that is unavoidable
but simply withheld on whichever
trials the first animal avoids; if
response contingency is unimportant, 
the course of learning in the two ani- 
mals should be the same. 
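
The control procedure just proposed amounts to giving the second animal the first animal's record of shocks. A minimal sketch of that arrangement follows; the function name and the sample record of responses are hypothetical, and nothing here is taken from an actual experiment.

```python
def yoked_schedules(avoidance_responses):
    """Build per-trial shock schedules for an avoidance animal and its control.

    avoidance_responses[i] is True when the experimental animal responded
    to the CS in time on trial i and so avoided the shock.
    """
    # Experimental animal: shock is contingent on failure to respond.
    experimental = [not responded for responded in avoidance_responses]
    # Control animal: the same shocks are delivered or withheld on the same
    # trials, whatever the control animal itself happens to do.
    control = list(experimental)
    return experimental, control


# Hypothetical record: the experimental animal begins to avoid on trial 3.
responses = [False, False, True, False, True, True]
experimental, control = yoked_schedules(responses)
print(experimental)
print(control)
```

Because the two animals receive identical sequences of shock, any difference in the course of learning may be attributed to the contingency of shock on response.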

In general, a classification of tech- 
niques may be treated as the expres- 
sion of a set of hypotheses about the 
functional significance of differences 
in technique—a distinction between 
two techniques as an assertion that 
they yield results which differ in 
some fundamental respect, and a 
failure to distinguish between two 
techniques as an assertion that they 
may be used interchangeably in the 
analysis of learning. This is not to 
say that a classification may not be 
preferred on historical, or on peda- 
gogical, or even on esthetic grounds, 
but only that a functional interpreta- 
tion is available which provides a 
basis for empirical evaluation. Meth- 
odological and functional considera- 
tions have, in fact, been linked rather 
closely in the past. Methodological 
distinctions have been taken as 
points of departure for dual-process 
analyses of learning, while strivings 
for a unitary conception have been 
reflected in the blurring of methodo- 
logical distinctions. One may even 
point to experiments designed ex- 
plicitly to provide a functional com- 
parison of different methods (Youtz, 
1938a, 1938b, 1939), although the 
empirical study of methodological 
interrelations certainly has not been 
carried very far. 

Functional considerations play a 
central role in the classification to 
be offered here, which grows out of a 
program of comparative research 
(Bitterman, 1960). The first step in 
the program is to assess the phyletic 
generality of certain theoretically 
significant phenomena of learning 
which have been established in work 
with the rat (hitherto the principal 
subject of research on learning), and
to that end a variety of simple animals
must be studied under conditions
analogous to those which have
been used for the study of the rat; 
but what are "analogous" condi- 
tions? Clearly, the answer to this 
question requires some hypotheses 
about the essential properties of the 
various techniques which have been 
used for the rat. As will later be 
indicated, the comparative enter- 
prise not only motivates further 
methodological analysis but consti- 
tutes a new source of data in terms of 
which the outcome may be evalu- 
ated. 


THORNDIKIAN SITUATIONS 


It seems reasonable to begin the 
analysis with a set of closely inter- 
related techniques which date back 
to the turn of the century and which 
have yielded most of the information 
on which contemporary conceptions 
of animal learning are based. The 
adjective Thorndikian is appropriate 
both because of Thorndike’s pioneer- 
ing role in their development and 
because their operation is predicated 
on an empirical law of effect. Fa- 
miliar examples are the problem box 
and the maze. In each of these situa- 
tions, traditionally, the experimenter 
sets out to change behavior by manip- 
ulating its consequences, that is, by 
arranging a contingency between 
some motivationally significant state 
of affairs ("reinforcement") and the 
behavior in question. Thus, pulling 
a loop in a problem box or turning to 
the left in a T maze may be en- 
couraged with food or discouraged 
with shock. Indeed, the motivational 
significance of any event may be 
assessed in terms of its effect on the 
response which produces it in such a 
situation. An event that facilitates 
the occurrence of a response upon 
which it is contingent is called a 
reward; an event that has the opposite
effect is called a punishment;
while an event that produces no 
measurable change in behavior is 
motivationally insignificant or neu- 
tral. An aversive stimulus is one 
whose onset is punishing, and in 
what is called escape training the off- 
set of such a stimulus serves as a 
reward. 


Unitary and Choice Situations 


An important distinction between 
two main types of Thorndikian situ- 
ation may be illustrated by a com- 
parison of the problem box and the 
maze. In both these apparatuses, the 
animal is afforded numerous possi- 
bilities for action, one of which the 
experimenter chooses to reward. The 
main difference between them has to 
do with the treatment of irrelevant 
responses. In work with the prob- 
lem box, the experimenter may take 
some qualitative notice of the variety 
of fruitless activities which appear, 
but his interest is centered on the 
rewarded response and the readiness 
with which it comes to expression. 
The basic datum is time. In the 
maze, by contrast, the unrewarded 
behavior of the animal is structured 
more clearly; certain major alterna- 
tives to correct response are deline- 
ated, and the interest of the experi- 
menter is centered on their decline 
and disappearance. The basic datum 
is error. Time may be recorded, but 
it does not as clearly reflect progress 
in the choice among alternative 
courses of action, the aspect of selec- 
tive learning which the maze is so 
well suited to display. 

The designation unitary Thorn- 
dikian situation (or T-1 situation) 
will be used here for the problem 
box and for any other Thorndikian 
situation in which but a single course 
of action is defined and the readiness 
with which it comes to expression is 
measured. The designation Thorndikian
choice situation (or T-2 situation)
will be used for the maze and
for any other Thorndikian situation 
in which two or more incompatible 
courses of action are defined and 
choice among them is studied. The 
nature of the responses delineated 
and the general properties of the en- 
vironments in which they appear are 
ignored in this classification. Thus, 
a problem box which offers a choice 
of manipulanda is classed with the 
maze as a T-2 situation, while the 
runway is classed as a T-1 situation 
despite its structural resemblance to 
the maze. The runway may, of 
course, provide a measure of error, 
as in the early works of Hicks (1911), 
who plotted the learning of a cul- 
less maze in terms of retracing, while 
the potentialities of the maze for 
the study of choice may be ignored, 
as in the early work of Thorndike 
(1898), who, measuring only time, 
used the maze as though it were 
just another problem box. In such 
cases, the classification is based on the 
use to which the apparatus actually is 
put in a given experiment. For the 
most part, however, contradictions 
between potentiality and use are rare. 
An investigator interested in choice 
among alternative courses of action 
is not likely to use a runway, nor, un- 
less he is interested specifically in 
choice among alternative courses of 
action, is he likely (today) to use a 
maze. 

Both T-1 and T-2 situations may 
be “chained.” The most common 
example of a chained T-2 situation 
is the maze of many choice-points, 
once very much the mode, but rarely 
encountered today, perhaps because 
of the conviction, expressed by 
Lashley (1918), that the single-unit 
maze is quite as sensitive as the 
multiple-unit maze to the effects of 
significant variables and much less 
costly in time and effort. (The two
kinds of apparatus are not, of course,
fully equivalent; certain problems— 
such as that of correction versus non- 
correction, first studied by Lashley— 
arise only when the number of 
choice points is reduced to one, while 
other problems—such as that of serial 
order—disappear.) Chained T-1 
situations never have been widely 
used. An example may be found in a
string of problem boxes, each present- 
ing one manipulandum, with the first 
giving access to the second, the 
second to the third, and so on, until 
the reward finally is attained (Her- 
bert & Arnold, 1947). For certain 
purposes, conceivably, mixed chains 
(composed both of T-1 and of T-2 
units) might be used. 


Generalized and Discriminative 
Situations 


Each of the two types of Thorndikian
situation already distinguished
(unitary and choice) may occur in
discriminative as well as in general- 
ized form. This new distinction, 
which is orthogonal to the first, will 
be conveyed by adding the letters g 
(for generalized) and d (for discrim- 
inative) to the symbols for unitary 
and choice: T-1g, T-1d, T-2g, T-2d. 
In a discriminative problem, the ex- 
perimental environment is varied 
systematically from trial to trial, and 
with it the consequences of response, 
the capacity of the animal to dis- 
criminate the change being inferred 
from a corresponding variation in be- 
havior. In a generalized problem, 
there may be some variation in the 
experimental environment from trial 
to trial (intentional or unintentional), 
and there may be some variation in 
the consequences of response (as in 
work on partial reinforcement), but 
there is (by definition) no correlation 
between the two kinds of change, and 
hence there is no objective basis for 
systematic variation in behavior. 


In the simplest T-1d case, a single 
defined response is rewarded under 
one set of conditions but not re- 
warded (or punished) under another 
set of conditions, and the readiness 
with which the response comes to ex- 
pression under the two conditions is 
compared. For example, response in 
a single-window jumping apparatus is
rewarded when a white card is dis- 
played but punished when the card 
displayed is black (Solomon, 1943). 
Performance in a T-1d problem may 
be expressed in terms of "error," but 
a temporal criterion is implied. For 
example, in an early experiment by 
Thorndike (1898), cats were fed for 
climbing to the top of their cage in 
response to the words "I must feed
those cats," but not for making the
same response to the words "Tomorrow
is Tuesday," an error being recorded
whenever they climbed up
(promptly) to the second phrase or 
failed (in a reasonable period of 
time) to climb up in response to the 
first. Similarly, Grice (1949), work- 
ing with another T-1d situation, 
computed the median response time 
for a series of trials and counted as an 
error any response to the negative 
stimulus faster than the median or 
any response to the positive stimulus 
slower than the median. A clear dis- 
tinction should be made between 
error thus defined and erroneous 
choice in a T-2d situation. 
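
The median criterion attributed above to Grice (1949) can be stated as a short scoring routine. This is a sketch only: the data layout, the function name, and the sample latencies are hypothetical, and a latency exactly equal to the median is treated here as correct.

```python
from statistics import median


def score_errors(trials):
    """Score T-1d trials against the median response time.

    trials is a list of (stimulus, latency) pairs, where stimulus is
    "positive" or "negative" and latency is the response time.
    """
    cutoff = median(latency for _, latency in trials)
    errors = []
    for stimulus, latency in trials:
        if stimulus == "negative":
            # Responding faster than the median to the negative stimulus.
            errors.append(latency < cutoff)
        else:
            # Responding slower than the median to the positive stimulus.
            errors.append(latency > cutoff)
    return cutoff, errors


# Hypothetical latencies in seconds.
trials = [("positive", 1.0), ("negative", 5.0),
          ("positive", 4.0), ("negative", 2.0)]
cutoff, errors = score_errors(trials)
print(cutoff, errors)
```

An "error" of this kind rests on a temporal criterion applied to a single response, and so should not be confused with a wrong choice between alternatives.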

In a T-2d problem, two or more 
alternative responses are defined— 
two in the simplest case. One of the 
responses is rewarded and the second 
unrewarded (or punished) under a
given set of conditions, while the 
consequences of the two courses of 
action are reversed under another set 
of conditions, and erroneous choices 
are counted. For example, in a con- 
ventional jumping apparatus, a jump 
to the right window is rewarded
when the card in the right window is
white and the card in the left window
is black, but a jump to the left win- 
dow is rewarded when the positions 
of the two cards are interchanged; or, 
in the same apparatus, response to 
the right window is rewarded when 
two white cards are displayed, but 
response to the left window is re- 
warded when two black cards are dis- 
played. (Problems of the first kind 
have been termed "simultaneous"
while problems of the second kind
have been termed "successive," an
adjective applied as well to T-1d 
problems; considerable confusion has 
resulted from the failure to distin- 
guish between T-1d and successive 
T-2d problems.) The so-called 
"higher order" discriminations—od- 
dity, matching-from-sample, and 
multiple choice—also may be clas- 
sified as T-2d problems, although 
they seem to make demands which 
go far beyond those of the simpler 
problems first exemplified. The T-1d 
and T-2d categories actually are 
rather coarse ones which themselves 
invite careful analysis and subdi- 
vision. 
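
The two jumping-apparatus contingencies described above may be written out as rules that return the rewarded window for a given display. The sketch below is illustrative only; the function names and the string labels for cards and windows are hypothetical.

```python
def rewarded_side_simultaneous(left_card, right_card):
    """Simultaneous problem: one white and one black card are shown together,
    and the rewarded window is the one displaying the white card."""
    if {left_card, right_card} != {"white", "black"}:
        raise ValueError("expected one white card and one black card")
    return "left" if left_card == "white" else "right"


def rewarded_side_successive(left_card, right_card):
    """Successive problem: both windows show the same card; two white cards
    reward the right window and two black cards reward the left window."""
    if left_card != right_card or left_card not in ("white", "black"):
        raise ValueError("expected two white cards or two black cards")
    return "right" if left_card == "white" else "left"


print(rewarded_side_simultaneous("black", "white"))
print(rewarded_side_successive("black", "black"))
```

In both problems a choice between two windows is scored, which is what separates the successive T-2d problem from a T-1d problem, where only the readiness of a single response is measured.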

Like T-2g situations, T-2d situations
may be "chained," as when
an animal is required to make a series 
of choices based on brightness before 
the reward is attained (Stone, 1928). 
Meaningful T-1d chains also are pos- 
sible, although no instance of such a 
chain is to be found in the literature. 
For example, response to a manipu- 
landum in one unit gives immediate 
access to the next unit when the 
positive stimulus is present; when 
the negative stimulus is present, ac- 
cess to the next unit is given after a 
predetermined period of time whether 
or not the animal responds.


Discrete and Continuous Situations 


There has been little clarity on the
relation of Skinner's technique to
other techniques for the study of
learning in animals. It has been
asserted by Woodworth (1938), for
example, that the Skinner box "bridges
the gap" between the problem
box and the classical conditioning 
situation, and a similar view is met 
again in Spence (1956), who places 
the Skinner box on a continuum at a 
point intermediate between the meth- 
ods of Thorndike and Pavlov; but 
the notion of continuity is difficult to 
justify. Skinner (1935, 1937) cer- 
tainly has succeeded very well in 
drawing a sharp line between his 
method and that of Pavlov on the 
basis of criteria which fail to dis- 
tinguish his method from that of 
Thorndike. 

Skinnerian situations are Thorn- 
dikian situations as the term is de- 
fined here. The original Skinner box 
differs from the older problem box 
only in that it delivers food to the 
response compartment (instead of 
admitting the animal to a separate 
feeding compartment) when the de- 
fined response is made, a feature 
which eliminates handling of the ani- 
mal between trials. Equipped with a 
retractable lever, which is introduced 
to begin each trial and withdrawn 
after response, the Skinner box may 
be used in exactly the same manner 
as the older problem box; in fact, a 
retractable manipulandum which de- 
livered food to the responding animal 
was developed for the monkey by 
Thorndike himself (1901). Skinner 
(1932), of course, has preferred to use 
his apparatus as a "repeating" prob- 
lem box—his own adjective—invert- 
ing the traditional measure of per- 
formance, and substituting for time 
per response on discrete trials number 
of responses per unit time (rate of 
response) to a continuously available 
lever. Either way, a Skinner box 
containing one lever may be classified 
as a T-1 situation. A single response
is delineated, its consequences are
manipulated, and the readiness with
which it comes to expression is 
measured. With two levers and the 
study of choice, the Skinner box be- 
comes a T-2 situation. 

It seems reasonable, nevertheless, 
to make a formal distinction between 
Thorndikian situations in which la- 
tencies or choices are measured in 
discrete trials and their Skinnerian 
counterparts in which rates of re- 
sponse are measured under condi- 
tions of continuous opportunity to 
respond. Situations of the first kind 
will be designated as discrete, while 
those of the second kind will be desig- 
nated as continuous, and for symbolic 
purposes the subscripts d (for discrete)
or c (for continuous) will be
added to the T for Thorndikian, as,
for example, in $T_d$-2g (discrete, choice,
generalized) or in $T_c$-1d (continuous,
unitary, discriminative). The
discrete-continuous distinction reflects
the hypothesis that rate of response, 
despite its close mathematical rela- 
tion to latency, has a functional sig- 
nificance which is to a certain extent 
unique, and some interesting evidence 
for this view comes from comparative 
studies of the effect of inconsistent 
reinforcement on resistance to extinc- 
tion. In the rat, discrete and con- 
tinuous techniques both give the so- 
called paradoxical effect (greater re- 
sistance to extinction after incon- 
sistent than after consistent rein- 
forcement). In the fish, initial re- 
sistance to extinction is greater after 
consistent reinforcement in the dis- 
crete case (Longo & Bitterman, 1960; 
Wodinsky & Bitterman, 1959, 1960); 
but some as yet unpublished data 
show greater resistance to extinction 
after inconsistent reinforcement in 
the continuous case. Whether the dif- 
ference in outcome may be traced to 
a difference in the functional proper- 
ties of the two techniques, or whether 
it is a product of certain parametric 
differences between the two sets of
experiments, remains to be determined.
The matter is introduced
here only to suggest the possibility 
that techniques which are function- 
ally equivalent for one species may 
not be so for others. In this connec- 
tion it is worth noting, perhaps, that 
the potentialities of the rate measure 
seem to be realized fully only when 
inconsistency of reinforcement is 
introduced. 
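
The three distinctions drawn so far (unitary versus choice, generalized versus discriminative, discrete versus continuous) combine into a single symbol. The sketch below shows one way of composing it; the class and field names are hypothetical, and the subscript on T is written as an ordinary letter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThorndikianSituation:
    alternatives: int      # number of defined courses of action
    discriminative: bool   # consequences vary with the stimulus conditions
    continuous: bool       # rate under continuous opportunity, not discrete trials

    def symbol(self) -> str:
        """Compose the symbol, e.g. 'Td-2g' for discrete, choice, generalized."""
        if self.alternatives < 1:
            raise ValueError("at least one course of action must be defined")
        timing = "c" if self.continuous else "d"
        kind = "1" if self.alternatives == 1 else "2"
        stimulus = "d" if self.discriminative else "g"
        return f"T{timing}-{kind}{stimulus}"


runway = ThorndikianSituation(alternatives=1, discriminative=False, continuous=False)
two_lever_box = ThorndikianSituation(alternatives=2, discriminative=False, continuous=True)
jumping_stand = ThorndikianSituation(alternatives=2, discriminative=True, continuous=False)
print(runway.symbol(), two_lever_box.symbol(), jumping_stand.symbol())
```

The classification depends on the use to which an apparatus is put, so the same apparatus may receive different symbols in different experiments.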


A General Definition of Thorndikian 
Situations 


In each of the Thorndikian situa- 
tions considered thus far, a change in 
behavior is measured which springs 
from a contingency between some de- 
fined response and some motiva- 
tionally significant state of affairs. 
Experiments on latent learning sug- 
gest, however, that a Thorndikian 
situation may be characterized with- 
out reference either to the actual oc- 
currence of change in behavior or to 
the motivational significance of the 
consequences of response. In the T-1 
case, an investigator may set out de- 
liberately to minimize the motiva- 
tional significance of the consequences 
of response in an effort to minimize 
the extent of change in behavior. 
For example, a hungry rat is trained 
in a runway which leads to an empty 
end box or to one which contains only 
water. To arrange a set of end-box 
conditions which are entirely without 
motivational significance is not, of 
course, always very easy, but it can 
be done (Gonzalez & Diamond, 1960). 
In the T-2 case, the consequences of 
alternative responses, whether moti- 
vationally significant or not, may be 
balanced in an effort to forestall the 
development of a preference for one 
or the other response. For example, 
a hungry rat is run in a simple 
maze with both end boxes empty, or
one empty and the other containing
only water, or one containing food
and the other both food and water;
or a rat that is both hungry and thirsty
is run in a T maze with one end box 
containing food and the other con- 
taining water. Such situations are 
intended merely to provide occasions 
for learning whose effects are esti- 
mated in later tests. The tests always 
involve a change in the motivation- 
al significance of the consequences 
of response: for example, food is 
added to a previously empty end box; 
or the end box is associated with food 
in direct feedings; or the prevailing 
condition of deprivation is altered, 
and with it the relevance of previ- 
ously encountered incentives. Never- 
theless, despite the careful attention 
which must be paid to motivational 
significance in evaluating the out- 
come of exposure to a Thorndikian 
situation, the situation itself may be 
defined without reference to motiva- 
tional significance. What is essential 
only is a contingency of some speci- 
fied event or circumstance on some 
measurable bit of behavior—a con- 
tingency arranged by an investigator 
who is interested in studying its 
effects on the animal.²


² No treatment of Thorndikian techniques
would be complete without some mention of a
set of situations closely related to the problem
box (calling for string-pulling, rake-wielding,
box-stacking, and the like) which figured
prominently in the work of certain of Thorndike's
critics, beginning with Hobhouse
(1901), who did not think that Thorndike's
apparatus provided a representative picture
of animal intelligence. Designed to be fully
"surveyable" (to conceal nothing from the
animal) and, although simple in principle, to
render "chance" solutions unlikely, these
(Hobhousian) situations present Thorndikian
contingencies of a rather loose sort and may be
used, like Thorndike's problem boxes, to
see the way in which the experience of such
contingencies affects subsequent behavior.
Their principal use, however, has been in
inquiries into the ability of animals to discover
appropriate modes of behavior in advance of
reinforcement, that is, in quests for evidence
of [word illegible] or "inferential" as contrasted
with "reproductive" or learned solutions.


PAVLOVIAN SITUATIONS 
Well before Pavlov’s experiments 
on conditioning became widely 
known, other investigators were led 
quite independently, by an interest in 
associative learning, to experiments 
of essentially the same kind. As far 
back as the turn of the century, a 
distinction was made between what 
was called "trial-and-error" or "selective
learning" (the modification of behavior as a
function of its consequences) and what was
called "association of stimuli" or "substitution"
(the acquisition by one stimulus of some of the
behavioral properties of a second stimulus as a
function of the pairing of the two stimuli). Primarily
concerned though he was with selec- 
tive learning, Thorndike (1898) him- 
self made use of paired stimulation; 
when a verbal statement such as “I 
must feed those cats" was followed 
regularly by the presentation of food, 
he reported, the words alone would 
bring the animals to the feeding place. 
It seems fitting nonetheless—in view 
of the scope of Pavlov’s (1927) con- 
tribution—that the method should 
bear his name. 
In the traditional Pavlovian experi- 
ment, as in the traditional Thorn- 
dikian experiment, the behavior of 
the animal is altered by the introduc- 
tion of some motivationally signifi- 
cant stimulus such as food or shock 
("reinforcement"), but there are important
differences. In a Thorndikian
experiment, reinforcement is con- 
tingent on response; doing one thing 
leads to food or to shock, doing 
another does not. In a Pavlovian 
experiment, reinforcement is sched- 
uled without regard to response; the 
experimenter does not set out to 
mold behavior in some predeter- 
mined fashion, but only to study the 
way in which the functional proper- 
ties of one stimulus are altered by 
virtue of its contiguity with another. 
Because their introduction is not 
contingent on the animal's behavior,
Pavlovian reinforcements cannot be 
treated as rewards or punishments in 
any meaningful manner, nor can re- 
wards and punishments be distin- 
guished in a Pavlovian experiment. 
Another difference between the two 
techniques is worth noting. In a 
Thorndikian experiment, the choice 
of the behavior which is to serve as 
the index of learning is independent 
of the choice of reinforcement; any 
of a large variety of responses which 
the animal is likely to make may be 
encouraged with food or discouraged 
with shock. In a Pavlovian experi- 
ment, the choice of reinforcement 
restricts the choice of a behavioral 
indicator; while the conditioned and 
unconditioned responses are not al- 
ways (as Pavlov thought) identical, 
the investigator must be guided in 
his search for evidence of learning 
by the functional properties of the 
reinforcing stimulus. Sharp as the 
distinction may be between the tradi- 
tional Thorndikian and Pavlovian 
procedures, it has been ignored very 
often by theorists preoccupied with 
the task of deriving all of the data 
of learning from the operation of 
a single process. Pavlov himself 
claimed, of course, that all instances 
of learning could be analyzed as in- 
stances of conditioning, although 
Thorndike, committed as he was to 
the generality of the law of effect, 
never was satisfied that Pavlov's 
procedure could be cast in the same 
mold as his own. 

Coordinate with the unitary Thorn- 
dikian (or T-1) situation is the uni- 
tary Pavlovian (or P-1) situation, in 
which the tendency for a CS to pro- 
duce some defined effect is measured 
in terms of latency or magnitude. The 
defined effect may be a response which
is reflexly elicited by the US, as in the 
salivary conditioning experiment, or 
something quite different, as when
the rate of fixed-interval responding 
in a Skinnerian situation is depressed 
by shock and by a stimulus paired 
with shock (Estes & Skinner, 1941). 
A P-1 situation may be generalized 
(P-1g) or discriminative (P-1d); in 
the discriminative case, the CS is 
varied systematically from trial to 
trial and with it the likelihood that 
the US will be presented (as, for 
example, when a bright light always 
is followed by food but a dim light 
never is). With two unconditioned 
stimuli, each eliciting a different re- 
sponse, it is possible to set up a P-2 
situation, the Pavlovian analogue of 
the Thorndikian choice situation. 
(A T-2 situation may be consti- 
tuted with but a single reinforcer, 
which is another interesting difference 
between Pavlovian and Thorndikian 
techniques.) The discriminative (P-2d) case
is perhaps easier to conceive than the
generalized (P-2g).
For example, one CS is paired with 
acid introduced into the mouth of a 
dog, while another CS is paired with 
meat-powder (Pavlov, 1927). The 
P-2g case must involve some in- 
consistency of reinforcement (which 
is, of course, not true of T-2g). For
example, a CS is paired with shock 
to the right forelimb on a random 
75% of trials and with shock to the 
left forelimb on the remaining 25% 
of trials. This is the Pavlovian ana- 
logue of a kind of T-2 situation in 
which there has been much interest 
of late. For example, a right turn at 
the choice point of a maze leads to 
food on a random 75% of trials while 
a left turn leads to food on the re- 
maining 25% of trials (Brunswik, 
1939). 
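
The two 75:25 arrangements just described differ only in what the random schedule is tied to, and a short sketch may make the contrast concrete. The Python below is illustrative only: the function names, the fixed seed, and the decision to assign exactly 75% of a block of trials (rather than an independent .75 probability on each trial) are assumptions of this sketch and are not taken from the studies cited.

```python
import random


def random_split(n_trials, p_major=0.75, seed=0):
    """Return a shuffled list of booleans that is True on exactly
    round(p_major * n_trials) trials (an assumed way of realizing
    'a random 75% of trials')."""
    n_major = round(p_major * n_trials)
    flags = [True] * n_major + [False] * (n_trials - n_major)
    random.Random(seed).shuffle(flags)
    return flags


def p2g_schedule(n_trials, seed=0):
    """P-2g: the CS is followed by shock to the right forelimb on 'major'
    trials and to the left forelimb otherwise, whatever the animal does."""
    return ["right" if flag else "left"
            for flag in random_split(n_trials, seed=seed)]


def t2g_outcome(major_trial, turn):
    """T-2g: food is on the right on 'major' trials and on the left on the
    remaining trials, so the outcome depends on the turn that is made."""
    baited_side = "right" if major_trial else "left"
    return turn == baited_side
```

In the Pavlovian case the schedule alone determines what happens on each trial; in the Thorndikian case the same schedule determines only which response will be reinforced.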

The discrete-continuous dichot- 
omy, which was developed in the 
analysis of Thorndikian procedures, 
seems to have no Pavlovian parallel; 
Pavlovian training is an affair of 
discrete trials. Nor does the notion 
of "chaining" have any application
to Pavlovian procedures. 

A Pavlovian situation, like a 
Thorndikian situation, may serve 
merely as an occasion for learning 
whose effects are measured in sub- 
sequent tests. One such case, well 
known to Pavlov, is that in which 
the presentation of CS and US is 
strictly simultaneous; only when the 
training procedure is altered can the 
effects of pairing be assessed. A 
second is that of "sensory preconditioning"
(conceived originally by Thorndike (1898)
himself as a check on the existence of
"representations"), which is analogous to the
Thorndikian experiment with consequences
of response which are lacking in 
motivational significance; neutral 
stimuli are paired, then one is given 
some behavioral property, and the ef- 
fects of the pairing are estimated from 
response to the other. A third case 
is that in which attention is centered 
on the acquisition, not of response- 
eliciting properties, but of rewarding 
properties (Williams, 1929); for ex- 
ample, an animal is fed repeatedly in 
a distinctive box (that is, box and 
food are paired), after which access 
to the empty box is made contingent 
upon response in a Thorndikian situation.
In one variety of experiment
which has considerable theoretical 
importance, the order of these ex- 
periences is reversed; the contin- 
gency of access to the empty box 
upon some response is displayed, 
after which the animal is fed in the 
box, and the effect on response is 
measured (Gonzalez & Diamond, 
1960). In general, then, a Pavlovian 
situation may be defined without
reference either to the occurrence in 
that situation of any particular kind 
of behavioral change, or to the func- 
tional properties of the stimuli which 
are paired. What is essential only is 
a sequence or conjunction of stimuli
whose contiguity is independent of the 
animal's response. 


AVOIDANCE SITUATIONS 


The only learning situations which 
cannot be classified unequivocally as 
Pavlovian or Thorndikian are those 
which involve the avoidance of aver- 
sive stimulation. In them, Pavlovian 
and Thorndikian features are closely 
intertwined. On the one hand, a 
neutral stimulus is paired with an 
aversive stimulus, thereby acquiring 
certain arousing properties. The 
pairing is not, on the other hand, 
entirely independent of the animal's 
behavior—the aversive stimulus is 
introduced only if the CS fails to 
elicit some defined response, whose 
likelihood of occurrence (low at the 
outset) the pairing serves to increase. 
This contingency of reinforcement 
on response is not displayed on the 
very first trial, as it is in a pure 
Thorndikian situation. In avoidance 
training, the contingency is a nega- 
tive one, which (since the mere pos- 
sibility of avoidance cannot influence 
the animal) does not become mani- 
fest until the Pavlovian procedure 
has taken effect. 

There is another Thorndikian con- 
tingency which operates in some 
(though not in all) avoidance situa- 
tions, this one making itself felt from 
the very first trial: termination of 
the aversive stimulus may be con- 
tingent on some defined response, 
often—but not always—the same re- 
sponse as that which avoids the 
aversive stimulus. In flexion condi- 
tioning, when shock to the limb is 
administered through a grid on which 
the limb of the animal rests, and 
when the scheduled duration of shock 
is substantial, flexion both escapes 
and avoids shock. In the shuttle box, 
too, the conditions of training may be 
such that changing compartments 
both escapes and avoids shock, al- 
though, as Warner (1932) noted
early, the response which escapes 
shock may be different from that 
which avoids it (for example, leaping 
over a hurdle as compared with crawl- 
ing under). It is possible, of course, 
to set up an avoidance situation in 
which there is no escape at all. In 
flexion conditioning, shock may be 
administered through a bracelet at- 
tached to the limb, and a control cir- 
cuit so arranged that the CR will 
forestall the shock but the UR will 
not alter its scheduled duration. In 
the shuttle box, the shock may be 
very brief, terminating quite inde- 
pendently of any response the animal 
may make to it (Hunter, 1935). 
Even without escape, however, there 
remains the contingency of aversive 
stimulation on failure of response to 
the CS, an essential feature of avoid- 
ance training which distinguishes it 
from Pavlovian training, while the 
paired stimulation which is respon- 
sible for the emergence of response to 
the CS distinguishes it from Thorn- 
dikian training. Avoidance training 
seems to require a major category of 
its own. 

In its most common use, the shuttle box
may be classified as an A_d-1g
situation (A for avoidance); a single 
course of action is defined, and its 
latency is measured in discrete trials 
without systematic variation in sen- 
sory conditions. The corresponding 
discriminative (A_d-1d) situation also
may be generated in the shuttle box; 
for example, a bright light is followed 
by shock unless the defined response 
is made, but a dim light never is 
followed by shock. In such a situa- 
tion, it may be noted, discrimination 
can progress only as the animal fails 
to respond to the dim light, since the 
consequences of response to the two
lights are identical. (In a T-1d situation,
by contrast, the consequences of response
to the stimuli to be discriminated are
different, and discrimination therefore is
facilitated by
response to the negative stimulus; in 
a P-1d situation, discrimination may 
progress quite independently of re- 
sponse.) 

Choice among alternative courses 
of action also may be studied in 
avoidance situations. Suppose, for 
example, that shock from a grid in 
the floor of a T maze is scheduled x 
seconds after an animal is placed in 
the starting box. In the generalized 
(A_d-2g) case, shock is avoided by
prompt entrance into the end box on 
the right, but not by entrance into 
the end box on the left. In the discriminative
(A_d-2d) case, a turn to
the right avoids shock when the stem 
of the maze is black, while a turn to 
the left avoids shock when the stem 
is white. Two unconditioned stimuli 
are not required to generate an A-2 
situation as they are to generate a P-2 
situation, but two unconditioned 
stimuli may be used. For example, 
one signal is followed by avoidable 
shock to the right limb, while a second
is followed by avoidable shock
to the left limb (James, 1947). 

The discrete-continuous dichotomy 
developed in the analysis of Thorn- 
dikian situations is applicable also to 
avoidance training. An A_c-1g situation
may be constituted in a modified
Skinner box or a shuttle box. In a
design developed by Sidman (1953), 
no exteroceptive warning signal is 
used, but shock is scheduled every x
seconds by a clock which the defined 
response resets. (The lack of an 
exteroceptive signal does not, of 
course, subvert the definition of 
avoidance training as originating in a 
quasi-Pavlovian contiguity of stim- 
uli; as Pavlov himself showed, in- 
ternal processes correlated with the 
passage of time since the occurrence 
of a specified event may be cast in 
the role of CS.) In the corresponding
discriminative (A_c-1d) case, the clock
which schedules shock runs only 
under one of two sensory conditions. 
Avoidance situations of the continu- 
ous type which do involve exterocep- 
tive signaling also are feasible. In the 
A_c-1g case, for example, shock from
a grid in the floor of a Skinner box 
is scheduled x seconds after the onset 
of a light and avoided by response on 
a variable-ratio schedule. A_c-2 situations,
both generalized and discriminative,
may be generated when
alternative courses of action are de- 
fined. 
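
The clock arrangement in the unsignaled design described above lends itself to a brief sketch. The Python below is a minimal illustration under stated assumptions: the function name and arguments are hypothetical, a single interval x is used both after a response and after a shock (the text specifies only that the response resets the clock), and a response falling exactly on the deadline is treated as too late.

```python
def sidman_shock_times(response_times, interval, session_length):
    """Return the times at which shock is delivered when a clock schedules
    shock every `interval` seconds and each response resets the clock.

    Assumptions of this sketch: the clock starts at time 0, and it also
    restarts (with the same interval) after each shock.
    """
    shocks = []
    deadline = interval  # first shock is due one interval into the session
    for t in sorted(response_times):
        if t > session_length:
            break
        # Deliver every shock that fell due before (or at) this response.
        while deadline <= t:
            shocks.append(deadline)
            deadline += interval
        # The response resets the clock.
        deadline = t + interval
    # Deliver any shocks still due before the session ends.
    while deadline <= session_length:
        shocks.append(deadline)
        deadline += interval
    return shocks
```

With an interval of 5 sec. and a 20-sec. session, an animal that never responds is shocked at 5, 10, 15, and 20 sec., while one that responds at least once in every 5 sec. is never shocked.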

Like Thorndikian situations, avoid- 
ance situations may be chained. Just 
as an animal may learn to run a 
simple T maze under threat of shock, 
so it may learn to run a multiple T 
maze. An example of chaining in an 
avoidance situation of the continuous 
type is the following: with the onset
of the CS, response to one manipu- 
landum is followed, on a variable- 
ratio schedule, by access to a second 
manipulandum, response to which, 
again on a variable-ratio schedule, 
terminates the CS and avoids shock. 

Although the term implies threat 
of an aversive condition which the 
animal learns to forestall, avoidance 
training, like Thorndikian and Pav- 
lovian training, may be characterized 
without reference to the nature of the 
stimuli employed or to the occurrence 
of behavioral change. It would be 
possible, for example, to train an 
animal with some neutral stimulus 
rather than shock in a shuttle box 
designed to produce a substantial 
frequency of spontaneous crossing, 
and then to test for learning after the 
neutral stimulus has been paired with 
shock. Irrespective of outcome, the 
conception of such an experiment is 
sufficient to delineate what is here 
regarded as the essential feature of 
avoidance training: a sequence of 
stimuli is scheduled with the occurrence
of the second contingent upon
the failure of the animal to make some
specified response to the first.


TERMINOLOGY 


While there need be no detailed 
comparison of the classification here 
proposed with earlier ones, it may be 
worth while, in the interest of pre- 
serving whatever compatible usages 
may exist, to consider how well some 
of the broader methodological desig- 
nations which now are current will 
serve the needs of the new classifica- 
tion. Since current terminology 
derives from earlier classifications, 
the major differences in emphasis 
must become quite apparent in the 
process. 

The term "conditioning" usually
is used for the kind of training here 
called Pavlovian, but that term also 
is used rather widely to designate 
techniques which are not here classi- 
fied as Pavlovian, and often as a 
synonym for "learning" itself. The
term “classical conditioning" is closer 
to what is here intended by Pavlovian, 
although in some contexts it has a 
narrower meaning (suggesting a har- 
nessed animal) and in other contexts 
a broader one (encompassing avoid- 
ance). Avoidance remains a useful 
term, but "instrumental condition- 
ing" is too ambiguous, since it has 
been applied indiscriminately both to 
avoidance training and to pure 
Thorndikian training. The term 
“operant conditioning” is even more 
ambiguous; it has a narrow (Skin- 
nerian) sense in which it is tied to a 
questionable distinction between 
“elicited” and “emitted” behavior, as 
well as a more general sense in which 
it is equivalent to instrumental con- 
ditioning. The term "selective learn- 
ing" has a pure Thorndikian conno- 
tation, but it seems to designate a 
process of learning rather than a 
method of studying it. 
In general, there is little to salvage
in the current terminology. Specific 
situational designations, such as maze, 
problem box, and runway, continue 
to be useful, but the broader classi- 
ficatory terms are unsuitable because 
they are geared to methodological 
dichotomy rather than to trichotomy. 
Even if dichotomy should in time 
give way to trichotomy, of course, it 
is likely that many of the older terms 
will continue to be used with altered 
meanings and with considerable con- 
sequent confusion. The terms for the 
subcategories here defined—unitary 
and choice situations, generalized and 
discriminative situations, discrete and
continuous situations—fortunately do 
not compete with established usages 
and therefore create less opportunity 
for confusion, although it is possible 
that a clearer notation might be
found. Reflection will show, however,
that complexity of notation is to
a certain extent an inevitable con- 
sequence of the amount of informa- 
tion to be conveyed. 

It is natural that a new classifica- 
tion should require a new terminol- 
ogy, although a change in classifica- 
tion does not, of course, necessarily 
imply an advance in conception. 
Whether the classification here pro- 
posed represents an advance in think- 
ing about the interrelations among 
learning situations cannot now be 
told. Classification is more, ulti- 
mately, than a matter of taste, but 
there is little else on which to depend 
at the present time. It is to be hoped
that a renewed concern with problems
of classification will stimulate fur- 
ther research on methodological inter- 
relations. 
The present review of studies of
persistence has two main aims: 

1. Of primary importance, it will 
consider the different approaches 
which have been made in the litera- 
ture to the investigation of persist- 
ence with humans. 

2. As a more specialized aim, it will
attempt to clarify the relationship of 
persistence at a task to motives and 
expectations by considering the fol- 
lowing two questions: How does the 
initial ease or difficulty of a task in- 
fluence the persistence displayed by 
a subject in his attempt to perform 
the task? How do individual differences
in strength of motive to achieve
success (M_s) and strength of motive
to avoid failure (M_af) influence
persistence at a task?

The general paradigm of the per- 
sistence situation is that in which a 
person is confronted with a very dif- 
ficult or insoluble task and is unre- 
stricted in either the time or number 
of attempts he can work at it. He is 
unsuccessful at each of these at- 
tempts at the task, but can turn to 
an alternative activity whenever he 
wishes. Persistence may be measured 
by the total time or total trials which 
the person works at the task before 
he turns to the alternative activity. 
The former measure is sometimes referred
to in the literature as temporal
persistence; the latter measure
is analogous to resistance to extinc- 
tion. 

Persistence may be distinguished 
from the performance level or effort 
involved in an activity and from the 
direction which an activity takes, 
but belongs with both of these as an 
important behavioral symptom of 
motivation. This distinguishing 
characteristic of motivated behavior 
has long been recognized in the liter- 
ature. To cite a few examples of its 
widespread recognition, McDougall 
(1908), in his discussion of instinct, 
lists persistence as one of the objec- 
tive features of purposive behavior; 
Tolman (1932), while rejecting the 
mentalistic teleology of McDougall's 
position, considers persistence-until- 
ends-are-attained as a basic criterion 
for molar, purposive behavior; Lewin 
(1935) discusses the persistence of 
tension within the regions of the per- 
son, a conception which has a crucial 
part in the interpretation of the re- 
search concerning rigidity, substitute 
activity, and interrupted tasks (cited 
in Lewin, 1946); and both Hull (1943) 
and Dollard and Miller (1950), 
within the context of drive theory, 
are concerned with the problem of 
continuing action. More recently 
Peak (1955) and Atkinson (1957) 
have emphasized that a theory of 
motivation has as one of its impor- 
tant aims the conceptualization of 
persistence in behavior; and Bindra 
(1959), arguing within the gen- 
eral framework of Hebb’s concepts 
(1949), considers persistence as one 
of the defining characteristics of goal 
directed action. Thus, there is no 
lack of recognition of the importance 
of accounting for persistence in behavior
despite diversity in conceptual
approaches. 

The background research falls into 
three fairly distinct classes. The first 
class comprises studies which are con- 
cerned with persistence as a trait or 
uniformity in behavior. Typically, 
these study—by correlational tech- 
niques—relationships between per- 
sistence scores (usually in terms of 
time) for a variety of different types 
of task. In more recent research, fac- 
tor analytic methods have been used 
in an attempt to account for the ob- 
tained correlations. The overriding 
interest in these studies is in con- 
sistency in behavior. Will a subject 
who persists at one task also tend to 
persist at another? Consistency, 
where it occurs, is assumed to allow 
the inference of a relatively stable 
personality characteristic. The role 
of situational factors in determining 
behavior tends to be ignored since 
the emphasis is on personality struc- 
tures or traits which transcend the 
situation. To the extent that mo- 
mentary situational influences are 
excluded from consideration, the 
trait approach commits the "organism
error" (MacKinnon, 1944).

The second class of studies comprises
those with humans which are
concerned with the problem of re- 
sistance to extinction. Although they 
are not commonly discussed as per- 
sistence studies, the structure of the 
situation is to some extent similar to 
that employed in studies where per- 
sistence is conceived as a trait. In 
the trait studies a common technique 
is to measure the time for which the 
subject persists at a very difficult,
effortful, or insoluble task without success,
i.e., temporal persistence. In the
extinction studies the subject typically
performs a task without reinforcement
after having been subjected to a particular
type of reinforcement schedule during an
acquisition series. Number of trials to
extinction is taken as the measure.
This similarity should not blind us to 
important differences between these 
situations. Extinction studies gen- 
erally ignore the possible effect of 
relatively stable personality differ- 
ences and focus more on the influence 
of situational variables, particularly 
differences in the pattern and amount 
of reinforcement in the acquisition 
series. In this respect they are in 
marked contrast to the personality 
oriented trait studies and, to the ex- 
tent that they exclude personality 
differences from consideration, they 
commit the "situation error" (Mac- 
Kinnon, 1944). 

Finally, the third class of studies 
comprises those in which persistence 
is conceived as a motivational phe- 
nomenon. On the one hand the 
theory may consider persistence 
mainly in terms of situational param- 
eters leaving personality variables 
relatively unspecified. Lewinian field 
theory appears to come close to this 
approach in its detailed analysis of 
factors in the psychological environ- 
ment such as valences and barriers, 
and its relative lack of an explicit 
analysis of the individual. On the 
other hand the approach may be 
more thoroughly interactive, as in 
the theory of achievement motiva- 
tion (Atkinson, 1957, 1960) which 
conceives of stable personality dis- 
positions or motives in interac- 
tion with expectations and incentive 
values which are both situationally 
defined. It is this latter approach 
that permits clarification of the ques- 
tion of how persistence might be re- 
lated to differences in strength of 
achievement related motives and ini- 
tial expectation of success. Hence, 
more attention will be given to the 
experimental literature from this 
area and only a partial sampling of 
studies from the other two classes of 
investigation will be presented.

The three classes of studies may be 
seen as falling on a continuum with 
personality oriented trait studies at 
one end, situation oriented extinction 
studies at the other end, and studies 
which consider the interaction of per- 
sonality and situation between the 
two extremes. 


PERSISTENCE CONCEIVED AS A 
TRAIT UNDERLYING BEHAVIOR 


These investigations fall into two 
main groups: 

1. Those which investigate corre- 
lations between persistence scores for 
a large number of different tasks, 
or correlations between persistence 
scores and other variables such as 
age, intelligence, or academic suc- 
cess, but which do not proceed to a 
factor analysis. These, in general, 
are the early trait studies as exempli- 
fied by the classical research of 
Hartshorne, May, and Maller (1929). 
They will be designated "nonfactorial
trait" investigations of persistence.

2. Those which do proceed to a 
factor analysis. These are designated 
"factorial trait" investigations of per- 
sistence. 

The transition from 1 to 2 exempli- 
fies the initial steps in the evolu- 
tion towards a scientific explanation. 
There is progression from the concept 
of persistence as a property of per- 
sons, some possessing it more than 
others, to the attempt to classify dif- 
ferent types of persistence. It will be 
argued, however, that classification 
is only a preliminary step, albeit a 
most important one, and that the 
factor analytic studies fail to pro- 
vide an adequate conceptualization 
of why people differ in persistence be- 
tween situations. 


Nonfactorial Trait Studies 


Many of these early investigations 
of persistence are reviewed by Ryans
(1939). Perhaps the best illustration
of a nonfactorial trait study is to be 
found in the monumental research of 
Hartshorne, May, and Maller (1929). 
In this investigation a wide variety of 
tasks were used, some of which were 
administered individually, others in a 
group setting. The eight persistence 
tests that were employed consisted of 
story resistance, puzzle mastery, pa- 
per and pencil puzzle solution, fatigue 
and boredom in mental work, hunt- 
ing for hidden objects, continued 
standing on right foot, eating cracker 
and whistling, and solving a toy puz- 
zle. The reliability coefficients were 
found to range from .40 to .85. Va- 
lidity coefficients obtained by com- 
paring test results with teachers' rat- 
ings of persistence were from zero to 
.33. The correlations between the
various tests were generally low. 
There was low positive correlation 
between results of persistence tests 
and intelligence test scores. Some 
tendency was found for persistence 
to increase with age for the age 
range (9-16 years) investigated. 

In common with the foregoing 
study, other early trait studies of 
persistence investigated an extensive 
and often bewildering variety of 
tasks which ranged from subjective 
ratings of persistence, through diffi- 
cult or insoluble puzzle tasks, to 
measures of physical endurance. Per- 
sistence was usually measured by 
total time taken at the task. In ad- 
dition to differences in the types of 
task employed one would also expect 
differences in the test context in 
which these tasks were presented, 
e.g., the degree to which the situation 
was achievement oriented, whether 
or not the task was administered in- 
dividually or to a group. In view of 
these differences it is not surprising 
that intercorrelations of persistence 
scores were often low. With the increasing
use of factor analytic methodology
it was natural for investigators
to look for order in the concomitances
and to go beyond the
often puzzling examination of the cor- 
relation table to the statistical intri- 
cacies of factor extraction. 


Factorial Trait Studies 


One of the earliest factor analytic 
studies concerned with persistence 
was carried out by Webb (1915). 
Working within the context of a 
Spearman analysis, he correlated rat- 
ings and isolated a W factor which 
was thought to comprise component 
traits such as reliability, tact, and 
persistence of motives. Crutcher 
(1934) in another early factorial in- 
vestigation tested London school 
children (age range 7-16 years) on 
persistence tests including card-house 
building, mechanical puzzle solution, 
addition, picture copying, and can- 
celing A's. He found a correlation of 
.30 between persistence and intelligence,
and a tendency for more persistent
children to be slightly inclined
towards introversion. When
the intercorrelations between time
taken on each of the various persistence
tasks by his subjects were
analyzed by the Spearman tetrad
method, Crutcher found some evidence
for a general factor. Alexander
(1935) identified an X factor,
which ran through school subjects
but not through ability measures, as
involving persistence, and this was
consistent with the generally obtained
positive correlation between
persistence measures and school success.
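
The tetrad criterion mentioned above can be illustrated briefly. If a single general factor accounts for the intercorrelations among four tests, each tetrad difference (for example, r12 r34 - r13 r24) is zero apart from sampling error. The Python sketch below computes the three tetrad differences for four variables; the function names are hypothetical and the loadings in the usage note are invented for illustration, not Crutcher's data.

```python
def one_factor_matrix(loadings):
    """Build the correlation matrix implied by a single general factor:
    r_ij = a_i * a_j for i != j, with unities on the diagonal."""
    n = len(loadings)
    return [[1.0 if i == j else loadings[i] * loadings[j] for j in range(n)]
            for i in range(n)]


def tetrad_differences(r, i, j, k, l):
    """Return the three tetrad differences among variables i, j, k, l
    of the correlation matrix r (a list of lists)."""
    return (
        r[i][j] * r[k][l] - r[i][k] * r[j][l],
        r[i][j] * r[k][l] - r[i][l] * r[j][k],
        r[i][k] * r[j][l] - r[i][l] * r[j][k],
    )
```

For a matrix generated from hypothetical loadings such as .8, .7, .6, and .5, all three differences vanish (to within rounding); raising one correlation above its one-factor value makes the tetrads involving it depart from zero.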

Later investigations by Ryans
(1938), Thornton (1939), Rethlingshafer
(1942), Kremer (1942), and
MacArthur (1955) are reviewed by
Eysenck (1953) in some detail. He
considers MacArthur's study as
methodologically the most satisfying.
MacArthur, following preliminary
investigation, selected a large number
of individual and group tests for ad- 
ministration, including the tradi- 
tional tests which had been used in 
the measurement of persistence, as 
well as measures of intelligence, 
school grades, age, self-ratings, peer 
ratings, and ratings by teachers. 
Subjects were 120 boys and the influ- 
ence of ability on persistence was 
partialed out. A Thurstone analysis 
of intercorrelations yielded five fac- 
tors as follows: (a) General persist- 
ence, with peer ratings (.603) and 
time spent on magic square (.584) 
having the highest saturations; this 
factor ran through both ideational 
and physical measures, word build- 
ing had a saturation of .472 and 
maintained handgrip .432. (b) A bi- 
polar factor contrasting individuality 
with prestige suggestibility; tests in 
which subject had no knowledge 
about performance of classmates 
were positively loaded; tests where 
this knowledge was available were 
negatively loaded; this factor re- 
sembled Kremer's "will to com- 
munity” factor. (c) A bipolar factor 
contrasting measures of reputation 
for persistence with objective meas- 
ures of time spent by subject at the 
task; this reputation factor resembled 
Kremer's "stability of character" 
factor; further, measures opposite in 
sign were very similar to those defin- 
ing Thornton's "keeping at a task" 
factor running through ideational 
tests. (d) A factor running through 
the physical tests and closely resem- 
bling Thornton's “withstanding dis- 
comfort to achieve a goal" and Reth- 
lingshafer's "willingness and/or abil- 
ity to endure discomfort." (e) A fac- 
tor running through spatial and nu- 
merical tasks and interpreted as 
spatial-numerical persistence. Mac- 
Arthur's results thus brought together 
in the one study many of the previous 
findings. A persistence score based 
on a combination of the eight tests 
with the highest communalities cor- 
related .30 with school marks with 
intelligence partialed out, and had an 
index of reliability of .90. MacArthur 
found it best to avoid suggesting that 
persistence was desirable and pro- 
vided activities to which subject 
could turn when he had spent as long 
as he wished at the task. 
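
The phrase "with intelligence partialed out" refers to a partial correlation. As a minimal sketch, assuming the standard first-order formula and purely hypothetical zero-order correlations (MacArthur's own intercorrelations are not reproduced in this review), the computation may be illustrated as follows. The function name and the input values are illustrative assumptions only.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant."""
    denominator = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denominator == 0.0:
        raise ValueError("x or y is perfectly correlated with z")
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical illustrative values, NOT taken from MacArthur (1955):
# x = persistence score, y = school marks, z = intelligence.
r_persistence_marks = 0.40
r_persistence_iq = 0.20
r_marks_iq = 0.50

r_partial = partial_correlation(r_persistence_marks, r_persistence_iq, r_marks_iq)
print(f"persistence-marks correlation, intelligence partialed out: {r_partial:.3f}")
```

When the control variable is unrelated to both of the other measures, the partial correlation reduces to the ordinary zero-order correlation.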

Eysenck (1953) summarizes the 
general results of the foregoing 
studies as follows: 
The evidence is fairly conclusive that per- 
sistence constitutes an important trait in our 
culture; that this trait is of relatively unitary 
nature and can be measured to the extent in- 
dicated by a validity of .9. In addition to this 
general factor of persistence, we find groups of 
activities which cluster together and define 
more specific types of persistence, such as per- 
sistence in physical tasks or persistence in 
ideational tasks. These smaller and less im- 
portant factors also are subject to measure- 
ment with a degree of validity probably not 
much below general persistence itself. Per- 
sistence as measured by tests is fairly closely 
related to persistence as rated by others, and 
can be said to predict performance in life 
situations to a definitely significant extent. 
Persistence tends to show slight correlations 
with intelligence, more impressive ones with 
“w” or lack of neuroticism and with introver- 
sion (p. 290). 


These latter results, involving neu- 
roticism and extraversion-introver- 
sion, were discovered in research 
conducted in Eysenck’s laboratory 
(1947, 1952). Persistence in these in- 
vestigations was measured by a phys- 
ical endurance test (holding leg above 
an adjacent chair). Recently, Ey- 
senck (1957) has attempted to give a 
theoretical account of the relation- 
ship of persistence to personality di- 
mensions using assumptions about 
the development of inhibitory and 
excitatory potentials, a conception 
which is related to Pavlov’s (1927) 
theory of experimental neurosis. Ex- 
traverts are assumed to be individ- 
uals in whom excitatory potential is 
generated slowly and at a weak level, 
but in whom inhibitory potential de- 
velops quickly to a strong level and 
then dissipates slowly. In contrast, 
introverts are assumed to be indi- 
viduals in whom excitatory potential 
is generated quickly and at strong 
level, but in whom inhibitory poten- 
tial develops slowly to a weak level 
and then dissipates quickly. Differ- 
ences between introverts and extra- 
verts in persistence are related to the 
differential in inhibitory potential. 
The stronger inhibitory potential de- 
veloped in extraverts would lead to 
the expectation that they should 
show relatively less persistence at a 
task than introverts. This expecta- 
tion is supported in the investigations 
using the physical endurance test. 


Evaluation of Trait Studies 


Without question the factor ana- 
lytic approach to the study of per- 
sistence marked an advance over 
the earlier simple correlation-type 
studies. While these latter investiga- 
tions did contribute a rich variety of 
tasks, they often appear to have in- 
volved a rather uncritical lumping to- 
gether of diverse activities without 
much attempt to account for similari- 
ties and differences in results. The 
factor analytic studies contributed to 
the classification of these tasks and 
thus assisted in bringing some order 
into a complex pattern of relation- 
ships. In this respect they repre- 
sented a step in the direction of sci- 
entific progress. But, while it is of 
value to know, for example, that ide- 
ational persistence tasks separate in 
factor analytic research from phys- 
ical persistence tasks and that there 
are differences in persistence for tasks 
administered in a group setting and 
tasks administered individually, this 
information itself serves only as a 
starting point. The scientist typ- 
ically goes beyond classification to ex- 
planation. He wants to know not 
only how phenomena may be grouped 
together, but also how they are uni- 
formly related to other phenomena in 
terms of a conceptual framework 
which will provide both a scientific 
account of extant findings and a 
promising direction to new research. 
Factor analytic studies of persist- 
ence are theory oriented only insofar 
as they invoke the concept of trait 
which carries with it the implication 
of a stable structure transcending the 
immediate situation. But it is argued 
that such a concept offers only an in- 
complete understanding. The dis- 
covery of the fine-grain of the ob- 
served phenomenon will involve not 
only knowledge of personality struc- 
tures but also a thorough understand- 
ing of the role of the immediate situa- 
tion in relation to behavior, and the 
interaction between such situational 
factors and personality dispositions. 
It should be possible to account both 
for differences between individuals in 
persistence in the one situation, and 
differences in persistence for the one 
individual in different situations. 

The next set of studies to be dis- 
cussed certainly does not satisfy this 
interactional requirement. In fact, 
these investigations appear to stand 
at the opposite extreme from the 
trait studies in that they examine the 
effect of variations in situation on 
persistence and tend to ignore rela- 
tively stable personality variables. 
Thus, like the trait studies, they are 
one-sided. They are, however, of in- 
terest in showing how persistence, in 
terms of resistance to extinction, 
varies with changes in an acquisition 
series. This type of study may be 
conceived as bearing more upon the 
relationship of persistence to expecta- 
tion. 


PERSISTENCE CONCEIVED AS 
RESISTANCE TO EXTINCTION 

As mentioned previously, studies 
in which resistance to 
extinction is related to different types 
of reinforcement schedules are not 
commonly classified as persistence in- 
vestigations. However, continuing 
an activity in the face of uniform 
nonreinforcement is similar to the fa- 
miliar persistence situation in which 
the subject works at a task without 
success. The relevance of extinction 
studies to the present review is in- 
creased further by the fact that one 
common interpretation of the ob- 
tained results relates the number of 
responses to extinction to a concept 
of expectancy and its manner of 
change. The studies considered in the 
present section are restricted to those 
in which discussion of results involves 
some reference to a concept of ex- 
pectancy. The reader is referred to 
reviews by Jenkins and Stanley 
(1950) and Lewis (1960) for compre- 
hensive coverage of the partial rein- 
forcement literature. 

The general finding in these extinc- 
tion studies is that: 
All other things equal, resistance to extinction 
after partial reinforcement is greater than 
after continuous reinforcement when be- 
havior strength is measured in terms of single 
responses (Jenkins & Stanley, 1950, p. 222). 


Jenkins and Stanley review a large 
number of studies which support this 
conclusion. 

For example, Humphreys (1939b) 
in an early study of eyelid condition- 
ing found that random alternation of 
reinforcement and nonreinforcement 
led not only to as much conditioning 
as reinforcement per trial but also to 
greater resistance to extinction. Dur- 
ing extinction, responses for subjects 
who were partially reinforced first in- 
creased in frequency and then de- 
creased. This result can be consid- 
ered in terms of expectancy theory 
since the subject would presumably 
have a greater expectation of a rein- 
forcement after one or two nonrein- 
forcements, particularly since in the 
acquisition stage of the experiment 
there were never more than two suc- 
cessive nonreinforcements. Hum- 
phreys argued that the shift from in- 
termittent reinforcement to uniform 
nonreinforcement must have led 
more slowly to the expectation of uni- 
form nonreinforcement for subjects 
who were partially reinforced, while 
this change in expectation would oc- 
cur more rapidly for subjects who 
were uniformly reinforced during ac- 
quisition. 

In a subsequent experiment, Hum- 
phreys (1939a) used an apparatus in 
which two lights were arranged on a 
board. The subject had to guess, 
when one of these lights was turned 
on, whether or not it would be fol- 
lowed by the other light. Half of the 
subjects were trained with the second 
light invariably following the first; 
for the other half the second 
light was turned on only in random 
alternation so that it appeared half 
of the time. Humphreys found that, 
under extinction conditions (i.e., the 
first light was never followed by the 
second), the uniform reinforcement 
group quickly developed the hy- 
pothesis of uniform nonreinforce- 
ment. The intermittent reinforce- 
ment group showed an initial rise in 
expectation followed by gradual ac- 
ceptance of the hypothesis that there 
would be no second light. 

More recently, Lewis and Duncan 
(1958) have studied the way in which 
variation in the frequency of reward 
during an acquisition stage and vari- 
ation in the length of an acquisition 
stage affects the number of trials to 
extinction and stated expectation of 
reinforcement. An electronic slot 
machine was used in the experiment 
and was set to pay off according to a 
prearranged schedule when buttons 
were pushed and a lever was pulled. 
In a factorial design, a constant pay- 
off occurred on 33%, 67%, and 100% 
of trials during acquisition, com- 
bined with 3, 6, 12, and 21 acquisition 
plays. Following the last acquisition 
play, extinction began with no fur- 
ther payoffs. The subject could play 
the machine for as long as he liked. 
Lewis and Duncan found that the 
larger the number of acquisition 
trials the fewer the plays to extinction. 
There was a tendency for smaller per- 
centages of reinforcement to be as- 
sociated with a greater number of 
plays to extinction, the usual partial 
reinforcement effect. Expectancies, 
measured before each trial by a rat- 
ing technique, were found to increase 
differentially during acquisition as a 
direct function of the percentage of 
reinforcement. Expectancies de- 
creased differentially during extinc- 
tion as a direct function of the per- 
centage of reinforcement, a result 
which is not inconsistent with the 
Humphreys interpretation. Finally, 
there was no clearcut statistical evi- 
dence that number of acquisition 
trials had any effect upon expectan- 
cies. In an earlier study (Lewis & 
Duncan, 1957) using the same ap- 
paratus, it was found that when 
magnitude of reward was varied, 
larger amounts of reward were as- 
sociated with more plays to extinc- 
tion. 
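
The layout of the Lewis and Duncan (1958) factorial design can be made concrete with a short sketch. It assumes that the reported 33% and 67% correspond to one third and two thirds of the acquisition plays (an assumption, since only rounded percentages are given here); the names used are illustrative and are not taken from the original study.

```python
from itertools import product

# Design levels as reported for Lewis and Duncan (1958).
# ASSUMPTION: 33% and 67% are treated as exactly 1/3 and 2/3.
REWARD_FRACTIONS = {"33%": (1, 3), "67%": (2, 3), "100%": (3, 3)}
ACQUISITION_PLAYS = (3, 6, 12, 21)


def design_cells():
    """Return one record per cell of the 3 x 4 factorial design."""
    cells = []
    for (label, (num, den)), plays in product(REWARD_FRACTIONS.items(), ACQUISITION_PLAYS):
        # Every play count is a multiple of 3, so this division is exact.
        rewarded = plays * num // den
        cells.append({
            "reward": label,
            "plays": plays,
            "rewarded": rewarded,
            "unrewarded": plays - rewarded,
        })
    return cells


for cell in design_cells():
    print(cell)
```

Under this assumption each of the 12 cells contains a whole number of rewarded plays, and the 100% groups meet their first nonreinforcement only when extinction begins.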

Lewis and Duncan (1958) main- 
tain that their results are consistent 
with a discrimination hypothesis 
concerning extinction (Bitterman, 
Fedderson, & Tyler, 1953). Extinction 
is assumed to occur because it is clear 
to the subject that no more rewards 
will occur. He responds to the acqui- 
sition series as a whole. Under condi- 
tions where extinction conditions are 
similar to acquisition conditions, 
discrimination is difficult and extinc- 
tion therefore prolonged. Where 
acquisition and extinction series are 
very dissimilar, discrimination is 
easier and extinction therefore more 
rapid. Percentage of reward is an 
important factor in similarity-dis- 
similarity of the two series. If the 
percentage of reward is low in the 
acquisition series, then the acquisi- 
tion series is more similar to the ex- 
tinction series; discrimination is diffi- 
cult and thus extinction takes longer. 

James and Rotter (1958), in an 
interesting methodological contribu- 
tion to the literature, argue that 
studies of the partial reinforcement 
effect have failed to consider differ- 
ences between those situations in 
which the subject is likely to see the 
occurrence of the reinforcements as 
outside his control and primarily 
contingent upon external conditions, 
and those situations in which the 
subject can relate the occurrence of 
the reinforcements to his own skill. 
They believe that the chance-skill 
dimension is an important one along 
which situations can be categorized. 
Thus, for example, Phares (1957) 
found significantly greater changes in 
expectancies for skill groups than for 
chance groups although all subjects 
had the same number and pattern of 
reinforcements. In research by 
Feather (1959b) this dimension was 
found to be important with respect 
to the influence of variation in sub- 
jective probability of success on the 
attractiveness of attaining a goal and 
the tendency to choose it. Littig 
(1959) in discussing the results of his 
study of probability preferences and 
subjective probability proposes a 
similar distinction between skill and 
chance situations. 

James and Rotter (1958) argue that 
in an acquisition series involving 
uniform reinforcement the subject 
receives no cues of nonreinforcement. 
Hence, when the extinction series 
begins, he is able to utilize the first 
nonreinforcements as cues that the 
situation has changed. If he per- 
ceives control of the situation as 
external, a sudden decrease in ex- 
pectancy and rapid extinction 
should tend to occur. But, under partial 
reinforcement conditions these cues 
of nonreinforcement are present in 
the training series. Hence, when the 
extinction series begins, the first non- 
reinforcement is not a new cue which 
can be used to discriminate a change 
in situation. Consequently, extinc- 
tion should be more gradual, with 
more trials required before the sub- 
ject recategorizes the situation. In a 
chance type of situation resistance to 
extinction is also increased by the 
subject's tendency to count and ver- 
balize relationships. In the acquisi- 
tion series he is likely to develop the 
hypothesis that a series of nonrein- 
forced trials will probably be fol- 
lowed by a trial on which reinforce- 
ment is forthcoming. This "gambler's 
fallacy" may also occur during the 
extinction series. Thus increased 
resistance to extinction should result. 

However, where the situation in- 
volves skill, the subject perceives 
reinforcements and cues as produced 
by factors controlled by himself. In 
the uniform reinforcement situation 
he is less likely (when the extinction 
series begins) to recategorize the 
situation as having changed. Under 
these conditions one would not ex- 
pect sudden decreases in expectancy. 
Furthermore, when the subject is 
partially reinforced in the skill situa- 
tion, the gambler’s fallacy should not 
be operative since nonreinforcement 
is attributed to lack of skill rather 
than to luck factors. James and 
Rotter (1958) maintain that this 
theoretical analysis would lead to 
the prediction that in skill situations 
a 100% reinforcement group would 
be more resistant to extinction than a 
partial reinforcement group. To test 
this prediction they contrasted 100% 
and 50% random reinforcement un- 
der conditions where subjects were 
instructed either that success at the 
task was determined by chance or by 
their own skill. Using a simple card 
guessing game presented tachisto- 
scopically with chance versus skill 
instructions, they found that under 
skill conditions the usual superiority 
of the partial reinforcement group did 
not occur. In fact, 100% reinforce- 
ment led to less rapid extinction than 
50% reinforcement, although the 
difference was not statistically sig- 
nificant. The usual partial reinforce- 
ment effect was obtained in the 
chance situation. The 100% chance 
group extinguished significantly 
faster than the 50% chance group. 


Evaluation of Extinction Studies 


The above investigations by Hum- 
phreys and by Lewis and Duncan 
imply that a high expectation of rein- 
forcement developed in an acquisi- 
tion series of trials will tend to be 
associated with low behavioral per- 
sistence as indicated in rapid extinc- 
tion when rewards are no longer 
forthcoming. Conversely, low ex- 
pectation of reinforcement will tend 
to be associated with high behavioral 
persistence as indicated in a greater 
resistance to extinction. However, 
the study by James and Rotter sug- 
gests that it would be unwise to pro- 
ceed to such a generalization for all 
situations. Usually partial reinforce- 
ment studies have been conducted 
with tasks in which the reinforce- 
ments, when they occur, appear to be 
externally controlled and independent 
of the skill of the subject. In contrast, 
as we saw in the preceding section, 
research in persistence in most cases 
has used tasks which are on the skill 
side of the continuum, where success 
and failure can be related to the 
subject's own efforts and not to con- 
ditions outside of his control. 
Furthermore, persistence tasks of 
the insoluble puzzle variety usually 
provide a wide and relatively inex- 
haustible range of alternative re- 
sponses following failure, whereas in 
the extinction study the response 
tends to be restricted to a particular 
action. Presumably, in the former 
case a number of different alternative 
responses have to extinguish while in 
the latter case only one. 

Finally, it is worth noting again 
the situational emphasis in these 
extinction studies. The stress is on 
varying the way in which reinforce- 
ments occur in an acquisition series 
and tends to exclude consideration of 
the effect of relatively stable per- 
sonality differences on resistance to 
extinction. As indicated, these stud- 
ies may be conceived as more relevant 
to the relationship of persistence to 
situationally elicited expectations. 
Thus, neither the trait studies nor the 
studies discussed in this section take 
account of both personality and situa- 
tion parameters in interaction, and 
their approach is to this extent 
limited. The studies to be discussed 
in the next section do attempt to 
recognize this interaction in their 
theoretical conceptualization al- 
though they vary in the degree to 
which the interaction is made ex- 
plicit. 


PERSISTENCE CONCEIVED AS A 
MOTIVATIONAL PHENOMENON 


The studies in this section conceive 
of persistence in relation to a theory 
of motivation. Two theories are 
particularly relevant to this review: 
Lewinian field theory with its as- 
sumption of behavior determined by 
the psychological life space and all 
that it involves; and the theory of achieve- 
ment motivation (Atkinson, 1957, 
1960) with its interactive assumption 
of motivation as a function of mo- 
tives, expectations, and incentive 
values. 

While both of these theories are 
interactional in the sense that each 
considers situational and personality 
parameters, it will be argued that the 
latter marks an advance over the 
former in the way in which persist- 
ence may be conceived, in that it 
offers a more explicit formulation 
both of individual differences in mo- 
tive and the way in which motives, 
expectations, and incentives are com- 
bined. Therefore it is considered that 
the theory of achievement motivation 
offers greater potential for an ex- 
plicit account of persistence in 
achievement contexts than Lewinian 
field theory. In this section, as in the 
discussion of trait studies, successive 
sets of studies exemplify an advance 
in the evolution of scientific explana- 
tion, in this case from fairly general 
field theory to a more explicit model 
concerned with behavior in the 
achievement situation. 


Lewinian Field Theory 


Lewinian field theory in its basic 
equation B = f(P, E) has long recog- 
nized the necessity of considering 
behavior in terms of interacting 
personality and situational factors. 
It is of interest to examine the ap- 
proach taken to the problem of per- 
sistence by this theory. 

The typical situation employed in 
the investigation of persistence can 
probably be represented topologically 
and dynamically in Lewinian terms 
as a frustration situation in which a 
person in a state of tension is sepa- 
rated at some psychological distance 
from a goal (or region of positive 
valence) by a barrier. This barrier is 
the source of restraining forces which 
oppose the driving forces acting upon 
the person in the direction of the 
goal. The barrier may be objectively 
insurmountable, as when the subject 
is given an insoluble puzzle and 
asked to solve it, or the barrier may 
represent a very difficult task, in 
which case the opposing restraining 
forces would be very strong but could 
possibly be surmounted. There may 
be other regions of positive valence 
in the psychological environment to 
which the subject may turn if he so 
desires. 

It is true that there are some situa- 
tions in the persistence literature 
which do not appear to conform to 
this paradigm. For example, the 
investigation of persistence against 
boredom and fatigue would appear 
to be more related to the Lewinian 
studies of satiation, and dynamically 
different from the frustration type of 
situation mentioned above. But 
Lewin (1946) himself appears to con- 
sider persistence in terms of the 
person-barrier-goal situation when he 
writes: "What is usually called per- 
sistence is an expression of how 
quickly goals change when the in- 
dividual encounters obstacles" (p. 
824). He discusses research by 
Fajans (cited in Lewin, 1946) as fall- 
ing within this context. Fajans, in a 
study of success, persistence, and 
activity in infants and young chil- 
dren, investigated the effect of sepa- 
rating children from a goal object at 
different distances. She found that 
previous failure at a task decreased 
persistence when subjects were again 
confronted with the same type of 
difficulty and when persistence was 
measured by duration of approach. 
In contrast, success led to a relative 
increase in persistence. When the 
same task was repeated, a combina- 
tion of success and praise was more 
effective in increasing persistence 
than success alone. Similar effects of 
success and failure were found by 
Wolf (1938). Persistence was also 
found to increase with decreasing 
distance from the goal. These results 
are relevant to the present review to 
the extent that previous success and 
failure may be considered important 
determinants of present expectations 
of success and failure. Considered in 
this light, the Fajans research implies 
that persistence is positively related 
to the subject's expectation of suc- 
cess and negatively related to his 
expectation of failure. 

It is important to note that Fajans' 
research, and indeed the great bulk of 
Lewinian investigations, does not 
take account of individual differences 
in both nature and strength of mo- 
tive. While the concept of tension is 
basic in Lewin's conceptualization of 
the person and influences the way in 
which the person sees the environ- 
ment, it is not spelled out in any de- 
tail in the theory. It is a concept 
consistent with the field emphasis of 
a person in an environment each 
influencing the other, but it applies to 
an individual and differences between 
persons with respect to tension are 
not explicitly formulated. One might 
ask how many different types of ten- 
sion there are, whether Individual X 
is stronger than Individual Y with 
respect to one type of tension, how 
these tensions develop, to what ex- 
tent they are stable, what is their 
precise relationship to valence, etc. 
But perusal of Lewinian field theory 
does not provide definite answers to 
these questions. Perhaps this is a 
consequence of the emphasis on the 
individual case, and the protest 
against Aristotelian class concepts. 
In actual experiments it appears to 
lead to operations which are more 
concerned with manipulating the 
psychological environment of a per- 
son, and details of differences in 
personality structure tend to be left 
unspecified. Where the P is manipu- 
lated the operations are directed to 
producing an intraindividual effect 
for the one person rather than study- 
ing the effect of variation between 
persons. 

That personality differences are 
important is recognized by Henle 
(1955) in her research on substitute 
value within a Lewinian context. 

Thus she writes: 


the relevant segment of behavior does not con- 
sist simply of activity directed to satisfy a 
quasi-need to complete a particular task, but 
must be thought of as activity to a more inclu- 
sive goal (p. 536). 


Presumably then we need to consider 
not only transient situational factors 
but also the specific behavior of the 
person in relation to more general and 
inclusive life goals and motives. In 
this sense the theory of achievement 
motivation (Atkinson, 1957, 1960) is 
more explicit than Lewinian field 
theory. It involves not only a con- 
ceptualization of the effects of mo- 
mentary situation (in terms of the 
concepts of expectation and incentive 
value) but also provides for the influ- 
ence of relatively stable dispositions 
(or motives) on behavior. The theory 
is more restricted than Lewinian field 
theory since it is specifically directed 
to the analysis of behavior in achieve- 
ment contexts where performance 
may be related to standards of ex- 
cellence. However, it is possible that 
the general "expectancy value" ap- 
proach, of which the theory of 
achievement motivation is a particu- 
lar example, may ultimately clarify 
our understanding of behavior in 
many other types of situation. 


Theory of Achievement Motivation 


Of the four studies of persistence to 
be considered in this section only the 
latter two studies by Atkinson and 
Litwin (1960) and by Feather (1960) 
are historically an outcome of the 
theory of achievement motivation. 
The other two investigations by 
Winterbottom (1958) and by French 
and Thomas (1958) preceded the 
theory and are more readily classified 
with earlier studies (McClelland, 
Atkinson, Clark, & Lowell, 1953) 
which were concerned initially with 
the development of a valid measure 
of the achievement motive. This 
validation was of two main kinds: 
experiments concerned with the ef- 
fect of experimental arousal of moti- 
vation on imaginative thought, and 
experiments concerned with the ef- 
fect of individual differences in 
motive strength on behavior. The 
initial approach adopted towards 
validating the measuring instrument 
had a common sense basis. It was 
expected that the achievement mo- 
tive would be more strongly elicited 
under achievement oriented or test 
conditions than in a relaxed situation, 
and that subjects with high achieve- 
ment motive would demonstrate this 
in behavior, e.g., by working harder 
at a task than subjects with weaker 
achievement motive. The studies by 
Winterbottom, and by French and 
Thomas, fit into this class since they 
investigated the common sense pre- 
diction that subjects with high 
achievement motive should have 
relatively higher motivation to suc- 
ceed and hence should tend to show 
greater persistence. 

Winterbottom (1958), as part of an 
investigation of the relationship of n 
Achievement to early childhood train- 
ing experiences, observed 29 8-year- 
old boys in a puzzle solving situation. 
During the test each child was given 
the opportunity to ask for help when- 
ever he wanted it, and was offered 
help and rest at intervals. Using the 
projective thematic apperception 
(TAT) method (McClelland et al., 
1953) of measuring the achievement 
motive, Winterbottom found that 
boys who were high in n Achievement 
on stories obtained immediately after 
the puzzle solution period (i.e., under 
achievement oriented conditions), less 
frequently requested help and more 
often refused an invitation to stop 
work and rest than boys low in 
n Achievement. Boys high in n 
Achievement under both relaxed and 
achievement oriented conditions more 
often refused help even when it was 
offered. Thus, she obtained evidence 
for greater persistence in the high n 
Achievement group in the sense of 
desire to continue with the task with- 
out external assistance or rest. In 
common with a great deal of research 
in achievement motivation, Winter- 
bottom defined high and low n 
Achievement groups on the basis of a 
median split in the distribution of 
scores. 

French and Thomas (1958) in a 
study involving 92 subjects from a 
United States Air Force base, found 
a clear positive relationship between 
time spent on a complicated mechani- 
cal problem and n Achievement as- 
sessed by the French Test of Insight, 
an apperceptive content device scored 
for n Achievement in the same way as 
the TAT (French, 1958). Again, 
high and low n Achievement groups 
were contrasted on the basis of a 
median split. The majority of the 
high motive group used most or all of 
the time available for the puzzle (35 
minutes), while only a few of the low 
motive group continued with the 
puzzle to the end. Thomas (1956) 
had previously found that strength 
of achievement motive was related to 
the length of time a subject would 
work at a problem without objective 
knowledge of progress. 

As we have indicated, these two 
studies were not conceived with any 
theoretical model in mind. However, 
it was natural that, with the accumu- 
lation of research concerning achieve- 
ment motivation, attempts would be 
made to conceptualize results in 
terms of some systematic framework. 
McClelland and Clark (McClelland 
et al., 1953, Ch. 2), in the course of a 
discussion of a more inclusive theory 
of motivation involving the concept 
of affective arousal, had made some 
suggestions about the status of the 
achievement motive with respect to 
this theory. In the Nebraska Sym- 
posium on Motivation, Atkinson (1954), 
drawing on formalizations of expect- 
ancy theory by MacCorquodale and 
Meehl (1953) and Tolman and Post- 
man (1954), analyzed the role of the 
situation in relation to behavior in 
achievement contexts in terms of the 
expectations which are elicited. This 
line of thinking was further elabo- 
rated in his later analysis of risk tak- 
ing behavior in terms of a theory of 
achievement motivation (Atkinson, 
1957). The general approach, of which 
the theory of achievement motivation 
is a particular case, considers motiva- 
tion,² expressed in the direction, mag- 
nitude, and persistence of behavior, 
as a positive function of the strength 
of motive within the person, the 
strength of the expectancy of satisfy- 
ing the motive through some action 
instrumental to the attainment of a 
goal or incentive, and the value of 
the specific goal or incentive that is 
presented in a given situation. As 
such, this approach belongs to a class 
of expectancy value theories which 
all involve somewhat similar con- 
cepts (Feather, 1959a). 
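
The general form of such an expectancy value account may be sketched as follows. The text above states only that motivation is a positive function of motive, expectancy, and incentive; the multiplicative combination used here is an assumption made for illustration (it is the form adopted in Atkinson's 1957 model), and the function name and numerical values are hypothetical.

```python
def motivation_strength(motive: float, expectancy: float, incentive: float) -> float:
    """Strength of motivation, ASSUMING a multiplicative combination rule."""
    if not 0.0 <= expectancy <= 1.0:
        raise ValueError("expectancy is a subjective probability and must lie in [0, 1]")
    return motive * expectancy * incentive


# Hypothetical values: two persons who differ only in strength of motive
# and who face the same situation (same expectancy and incentive value).
strong = motivation_strength(motive=3.0, expectancy=0.5, incentive=0.5)
weak = motivation_strength(motive=1.0, expectancy=0.5, incentive=0.5)
print(strong, weak)
```

Under this assumed rule a stronger motive yields stronger motivation when expectancy and incentive are held constant, and motivation vanishes when any one of the three terms is zero.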

The theory of achievement motiva- 
tion is specifically addressed to those 
situations in which performance at a 
task may be evaluated against a 
scale of excellence related to diffi- 
culty. In such a situation, successful 
performance at a difficult level is 
highly valued and is generally ac- 
companied by a feeling of pride in ac- 
complishment, while failure at an 
easy level is negatively valued and is 
generally accompanied by a feeling of 
shame and embarrassment. The 
theory involves the following six 
variables: the subjective probability 
or expectancy of success ($P_s$), the
subjective probability or expectancy
of failure ($P_f$), the positive incentive


2 The theory draws a sharp distinction
between motive and motivation. Strength of 
motive is one influence on strength of motiva- 
tion. Other factors influencing strength of 
motivation are levels of expectation and mag- 
nitude of incentive value. 


value of success ($I_s$), the negative
incentive value of failure ($I_f$), the
motive to achieve success ($M_S$), and
the motive to avoid failure ($M_{AF}$).
The subjective probabilities refer to 
situationally aroused expectancies in 
the person concerning the probability 
of the consequences of instrumental 
acts. Positive incentives refer to 
potential rewards and goals, negative 
incentives to potential punishments 
and threats. Motives are conceived 
as dispositions within the person to 
approach certain classes of positive 
incentives and to avoid certain 
classes of negative incentives. The 
motive to achieve success ($M_S$) is
conceived as a disposition to derive
satisfaction from successful exercise
of skill; the motive to avoid failure
($M_{AF}$) is conceived as an independent
disposition to experience shame and 
embarrassment as a result of failure. 
Both of these dispositions are as- 
sumed to be activated when the situ- 
ation arouses expectancies in the 
person that his performance will be 
evaluated against standards of ex- 
cellence associated with success and 
failure. These motives are considered 
to be relatively stable dispositions of 
the person acquired early in life. 
Specific assumptions are made in 
the theory concerning the incentive 
values of success and failure. These 
assumptions were suggested in more 
general form in the resultant valence 
theory of level of aspiration (Lewin, 
Dembo, Festinger, & Sears, 1944). 
There it was assumed that the at- 
tractiveness of success increases with 
decrease in the expectation of success, 
and that the repulsiveness of failure 
increases with decrease in the ex- 
pectation of failure. In addition 
Lewin et al. assumed that the sub- 
jective probability of failure is one 
minus the subjective probability of 
success. The relationship of attrac-
tiveness of success and repulsiveness
of failure to subjective probability is
made more explicit in the theory of 
achievement motivation. It is as- 
sumed that the positive incentive 
value of success is one minus the 
subjective probability of success: i.e.,
$I_s = 1 - P_s$. Further, the negative
incentive value of failure is taken as
the minus value of the subjective
probability of success: i.e., $I_f = -P_s$.

The theory assumes that the basic 
variables combine multiplicatively to 
determine positive achievement moti-
vation ($M_S \times P_s \times I_s$) and negative
failure avoidant motivation
($M_{AF} \times P_f \times I_f$). These two component
motivations combine additively to 
generate resultant motivation. 
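
The combination rule just described can be written out as a short computation. The following is a minimal Python sketch, not part of the original presentation of the theory: the function name `resultant_motivation` and the numerical motive strengths are hypothetical, while the relations $I_s = 1 - P_s$, $I_f = -P_s$, and $P_f = 1 - P_s$ are those stated in the text.

```python
def resultant_motivation(m_s: float, m_af: float, p_s: float) -> float:
    """Resultant achievement motivation for one task (illustrative sketch).

    m_s  -- strength of the motive to achieve success (M_S)
    m_af -- strength of the motive to avoid failure (M_AF)
    p_s  -- subjective probability of success (P_s), between 0 and 1
    """
    if not 0.0 <= p_s <= 1.0:
        raise ValueError("p_s must lie between 0 and 1")
    p_f = 1.0 - p_s   # subjective probability of failure
    i_s = 1.0 - p_s   # positive incentive value of success
    i_f = -p_s        # negative incentive value of failure
    approach = m_s * p_s * i_s     # motivation to achieve success
    avoidance = m_af * p_f * i_f   # motivation to avoid failure (never positive)
    return approach + avoidance    # the two components combine additively


# Hypothetical motive strengths, chosen only for illustration.
for p in (0.1, 0.5, 0.9):
    print(p, resultant_motivation(2.0, 1.0, p), resultant_motivation(1.0, 2.0, p))
```

Because both components contain the factor $P_s(1 - P_s)$, the sum reduces to $(M_S - M_{AF}) \times P_s(1 - P_s)$: its sign follows the stronger motive, and its absolute size is greatest at $P_s = .50$.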

The theory implies that this re- 
sultant motivation to perform the 
task is positive when the motive to 
achieve success is stronger than the 
motive to avoid failure (i.e., $M_S > M_{AF}$)
and negative or task avoidant when
$M_{AF} > M_S$. Hence, an individual in
whom $M_S > M_{AF}$ should demonstrate
positive interest in achievement re-
lated tasks whereas an individual in
whom $M_{AF} > M_S$ should tend to avoid
achievement related tasks unless he is 
constrained to perform them. Pre- 
dictions can be made from the theory 
concerning the effects of individual 
differences in the strength of $M_S$ and
$M_{AF}$ on level of task performance,
risk taking, and persistence. These 
predictions have been tested in a 
number of investigations summarized 
in a recent review (Atkinson, 1960). 

Of particular interest to the present 
review is the study carried out by 
Atkinson and Litwin (1960). Using 
the theory of achievement motiva- 
tion they predicted that, holding task 
constant, stronger $M_S$ should be as-
sociated with greater persistence, and
stronger $M_{AF}$ should be associated
with less tendency to persist. Actu-
ally the study had a much wider set
of aims, for it was designed to show
the extent to which the resultant
motivation as predicted from the 
theoretical model would be manifested 
in the defining characteristics of be- 
havior qua motivated, namely, choice 
(direction), performance level, and 
persistence. In particular, Atkinson 
and Litwin were interested in provid- 
ing evidence for the construct validity 
of the French Test of Insight (French,
1958) and the Mandler-Sarason Test
Anxiety Questionnaire (Mandler & 
Sarason, 1952) as methods for as-
sessing strength of $M_S$ and $M_{AF}$, re-
spectively. The construct validity of
these tests would be strengthened by 
showing that, when they are used for 
assessing the corresponding motives, 
measures of risk taking, performance 
level, and persistence (holding task 
constant in a crude qualitative way) 
are all consistent with implications 
from the theory of achievement 
motivation. 

Using 49 undergraduate students 
enrolled in a sophomore-junior level 
psychology course at the University 
of Michigan, Atkinson and Litwin 
observed their behavior in a simple 
ring toss game as indicative of risk 
taking, the grades they received on 
their final exam in the course as 
indicative of performance level, and 
the amount of time they spent work- 
ing at the final exam as a measure of 
persistence. Taking scores for the 
Test of Insight and the Test Anxiety 
Questionnaire separately and classi- 
fying subjects as high or low for the 
corresponding motive on the basis of 
a median split in the respective dis- 
tribution, they found as predicted 
that n Achievement ($M_S$) was posi-
tively related and test anxiety ($M_{AF}$)
was negatively related to preference 
for intermediate risk, performance 
level, and persistence. The predic- 
tion was clearly improved when sub- 
jects were classified simultaneously on 
the basis of the two tests using me-
dian splits. For example, subjects
classified high in n Achievement and 
low in test anxiety worked longer at 
the final exam than subjects who were 
classified low in n Achievement and 
high in test anxiety. Subjects classi- 
fied high for both motive measures 
and low for both motive measures 
had persistence times falling between 
those of the two extreme groups. 
Thus, the study provided clear 
support for the construct validity of 
the Test of Insight and the Test 
Anxiety Questionnaire as methods of 
assessing $M_S$ and $M_{AF}$, respectively.
Predictions from the theory of 
achievement motivation were con- 
firmed using each test separately and 
the two tests in combination. The 
three dependent variables of the 
investigation were positively corre- 
lated implying that individual differ- 
ences in strength of motive tend to be 
stable from situation to situation. 
It should be noted that the Atkin- 
son-Litwin study was restricted to 
the investigation of persistence at a 
task in relation to differences in 
strength of achievement related mo- 
tives. It made no attempt to vary 
systematically expectations of suc- 
cess and failure as related to situa- 
tional cues or to specify clearly the 
level of initial $P_s$. Nor did it attempt
to account for why an individual 
stops working at a task. A recent 
study by Feather (1960, 1961) fo- 
cuses on these problems for the first 
time and investigates persistence in 
relation to the interaction of motives 
and situationally elicited expectations 
by varying both factors simultane- 
ously. This study therefore helps to 
clarify the more specialized questions 
raised at the beginning of this review 
concerning the relation of persistence 
at a task both to initial expectation 
of success and to individual differ-
ences in strength of motive to achieve
success and motive to avoid failure.
Hence the approach will be con- 
sidered in some detail. 

The theoretical analysis involved 
in the investigation is applied to the 
persistence situation where the sub- 
ject works at an achievement task 
which is presented to him as part of 
an important test and which is in fact 
insoluble. He undergoes repeated 
failure in his attempts to perform the 
task but may turn to an alternative 
achievement task when he so desires. 
The analysis is based on the relation- 
ship of total motivation to perform 
the task to total motivation to per- 
form the alternative. It is assumed 
that the subject will persist at the 
initial task as long as total motiva- 
tion to perform it is stronger than 
total motivation to perform the 
alternative. 

Total motivation to perform the 
initial achievement task is attributed 
to the following three components: 
motivation to achieve success at the 
task, motivation to avoid failure at 
the task, and extrinsic motivation to 
perform the task. Total motivation 
to perform the alternative activity is 
attributed to the same three com- 
ponent motivations since the alterna- 
tive is also achievement related. 
Both motivation to achieve success 
and motivation to avoid failure are 
conceptualized in accordance with 
the theory of achievement motiva- 
tion as multiplicative products of 
strength of motive, level of expecta- 
tion, and magnitude of incentive 
value. Extrinsic motivation refers to 
motivation to perform the task at- 
tributable to motives other than 
those which are achievement related 
(i.e., motives other than $M_S$ and
$M_{AF}$). For example, the usual social
constraints (e.g., desire for approval, 
fear of disapproval) provide an im-
portant source of motivation in any
situation where a subject is required
to work at a task.³ Ultimately such
extrinsic motivation may also be 
conceptualized as resulting from rela- 
tively stable personality dispositions 
(motives) in interaction with more 
transitory situational influences (ex- 
pectations and incentive values). 
Extrinsic motivation to comply with 
instructions may, for example, vary 
with the strength of affiliation motive 
and the degree to which the subject 
expects that compliance will produce 
certain affiliation rewards in the 
situation, e.g., approval for being a 
cooperative subject. In the present 
analysis of persistence, extrinsic mo- 
tivation to perform the initial 
achievement task is assumed to be 
stronger than extrinsic motivation to 
perform the alternative since the 
initial task is presented by the experi- 
menter as the first in a defined se- 
quence of achievement tasks. Fur- 
thermore, both sets of extrinsic 
motivation are assumed to be con- 
stant across the different experi- 
mental conditions. 

It is assumed in the theoretical 
analysis that the subject will turn to 
the alternative achievement activity 
whenever total motivation to per- 
form the initial task becomes weaker 
than total motivation to perform the 
alternative. The problem then be- 
comes that of specifying the basis for 
a decrease in total motivation to per-
form the initial task as the subject
works at it unsuccessfully. This 
decrease is assumed to be mediated 
by changes in both the motivation to 
achieve success at the initial task and 
the motivation to avoid failure at 
the task. These changes are in turn 


3 This is especially so for subjects in whom
$M_{AF} > M_S$. If these subjects are to perform
an achievement task at all, some stronger
positive motivation must exist to oppose their
tendency to avoid the task.


assumed to be determined by a suc-
cessive decrease in subjective proba-
bility of success ($P_s$) as the subject
repeatedly fails at the initial achieve- 
ment task. Hence decrease in expecta- 
tion of success with repeated failure 
becomes the basic dynamic principle. 
When this principle is used in con-
junction with the theory of achieve-
ment motivation and certain addi-
tional assumptions,⁴ it becomes pos-
sible to derive the following four
hypotheses: 

Hypothesis 1 states that subjects in 
whom the motive to achieve success 
is stronger than the motive to avoid 
failure ($M_S > M_{AF}$) should persist
longer at a task for which the initial
subjective probability of success is
high ($P_s > .50$) than similar subjects
for whom the initial $P_s$ is low
($P_s < .50$).

Hypothesis 2 states that subjects 
in whom $M_{AF} > M_S$ should persist
longer at a task for which the initial
$P_s$ is low ($P_s < .50$) than similar sub-
jects for whom the initial $P_s$ is high
($P_s > .50$).

Hypothesis 3 states that when 
initial $P_s$ is high ($P_s > .50$), subjects
in whom $M_S > M_{AF}$ should persist
longer at a task than subjects in
whom $M_{AF} > M_S$.

Hypothesis 4 states that when 
initial $P_s$ is low ($P_s < .50$), subjects in
whom $M_{AF} > M_S$ should persist longer
at a task than subjects in whom
$M_S > M_{AF}$.
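
The logic behind the four hypotheses can be traced in a small simulation. What follows is a rough Python sketch under loudly stated assumptions, not the procedure of the study itself: the function names, the motive strengths, the extrinsic values, the value of $P_s$ assigned to the alternative, and the proportional rule by which $P_s$ falls after each failure are all hypothetical choices made for illustration. The sketch keeps only what the text states: the subject persists while total motivation for the initial task exceeds total motivation for the alternative, extrinsic motivation is stronger for the initial task, and $P_s$ declines with repeated failure.

```python
def resultant(m_s: float, m_af: float, p_s: float) -> float:
    """Achievement-related motivation: M_S*P_s*I_s + M_AF*P_f*I_f."""
    return m_s * p_s * (1.0 - p_s) + m_af * (1.0 - p_s) * (-p_s)


def trials_before_switching(
    m_s: float,
    m_af: float,
    p_initial: float,
    *,
    p_alternative: float = 0.30,        # hypothetical P_s for the alternative task
    extrinsic_initial: float = 0.02,    # hypothetical; larger than the alternative's
    extrinsic_alternative: float = 0.0,  # hypothetical
    decay: float = 0.9,                 # assumed proportional drop in P_s per failure
    max_trials: int = 50,               # cut-off for subjects who never switch
) -> int:
    """Count failed attempts made before total motivation favors the alternative."""
    total_alternative = resultant(m_s, m_af, p_alternative) + extrinsic_alternative
    p_s = p_initial
    trials = 0
    while trials < max_trials:
        total_initial = resultant(m_s, m_af, p_s) + extrinsic_initial
        if total_initial <= total_alternative:
            break              # the alternative is now at least as attractive
        trials += 1            # another unsuccessful attempt
        p_s *= decay           # expectation of success falls after failure
    return trials


# Hypothetical motive strengths for the two kinds of subject.
success_oriented = dict(m_s=2.0, m_af=1.0)   # M_S > M_AF
failure_oriented = dict(m_s=1.0, m_af=2.0)   # M_AF > M_S

for label, motives in (("M_S > M_AF", success_oriented),
                       ("M_AF > M_S", failure_oriented)):
    for p_initial in (0.70, 0.05):
        print(label, p_initial, trials_before_switching(p_initial=p_initial, **motives))
```

With these particular values the ordering of trial counts agrees with all four hypotheses. The pattern is sensitive to the illustrative parameters: if the alternative is assigned $P_s = .50$, for example, subjects in whom $M_{AF} > M_S$ never reach a switch point in this simple version, so the sketch illustrates the reasoning rather than reproducing the investigation.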

4 It is assumed that $P_s$ for the alternative
task is constant across the experimental con-
ditions, that reduction in $P_s$ for the initial
task to a particular value will require more
unsuccessful attempts at the task when $P_s$ is
initially high than when it is initially low,
and that rate of decrease in $P_s$ is not system-
atically related to strength of $M_S$ or $M_{AF}$.


Thematic apperceptive stories ob-
tained under neutral conditions and
the Mandler-Sarason Test Anxiety Ques-
tionnaire were used to provide meas-
ures of n Achievement ($M_S$) and
test anxiety ($M_{AF}$) for 89 male college
students from an introductory course 
in psychology at the University of 
Michigan. It was assumed that 
$M_S > M_{AF}$ for subjects classified high n
Achievement-low test anxiety (in 
terms of median scores); in subjects 
classified low n Achievement-high 
test anxiety it was assumed that 
$M_{AF} > M_S$. Later, four complex per-
ceptual reasoning tasks were admin-
istered to 34 of these preselected
subjects in individual test sessions. 
The first and third tasks were in- 
soluble; every trial taken by the sub- 
ject at each of these tasks resulted in 
failure. The second and fourth tasks 
were soluble. Initial $P_s$ for each of the
four tasks was induced by reporting 
fictitious norms which indicated per- 
centage of students likely to succeed 
at the task. For half the subjects the
successive norms were 70%, 50%, 
5%, 50%; for the other half the 
norms were 5%, 50%, 70%, 50%. 
Measures of persistence were ob- 
tained on the first and third insoluble 
tasks. Persistence was measured by 
total trials or total time the subject 
worked at each insoluble task before 
turning to the next task in the set. 
Results for the first task clearly 
supported the four hypotheses. For 
example, subjects classified as high n 
Achievement-low test anxiety per- 
sisted longer at the first insoluble 
task when it was presented as easy 
(70% norm) than when it was pre- 
sented as very difficult (5% norm). 
This result implies that subjects in 
whom $M_S > M_{AF}$ show greater per-
sistence when initial $P_s$ is high than
when initial $P_s$ is low, which is in ac-
cordance with the first hypothesis.
In contrast, subjects classified as low 
n Achievement-high test anxiety 
persisted longer at the first insoluble 
task when it was presented as very 
difficult (5% norm) than when it was
presented as easy (70% norm). This
result is consistent with the second 
hypothesis. In the same way, results 
from the first insoluble task are con- 
sistent with Hypotheses 3 and 4. 

None of the expected differences in 
persistence between the different ex- 
perimental conditions occurred on 
the third task but there was a tend- 
ency for changes in persistence from 
the first to the third task, using each 
subject as his own control, to be 
consistent with the hypotheses. How-
ever, the necessity for unanticipated
interruption of a sizeable number of 
extremely persistent subjects on the 
first and third tasks, and the pos- 
sibility of uncontrolled sequence ef- 
fects, make interpretation of results 
for the third task equivocal. Post- 
experiment questionnaire information 
concerning assumptions involved in 
procedures and hypotheses tended to 
support the theoretical explanation. 


Evaluation of Motivational Approach 


The type of approach adopted in 
the above investigation raises a num- 
ber of issues that are relevant for the 
future study of persistence: 

1. The theoretical analysis sug- 
gests the importance of specifying 
the components of both total motiva- 
tion to perform the task with which 
the subject is presented and total 
motivation to perform the alterna- 
tive to which he may turn, when at- 
tempting to predict degree of per- 
sistence. The persistence situation is 
similar to a complex problem in 
decision where the subject is continu- 
ally confronted with the choice of 
continuing with the unsolved task or 
turning to the alternative. From this 
point of view the basic theoretical 
problem is that of specifying pre- 
cisely all of the component motiva- 
tions involved in the decision, the 
way in which they combine to deter-
mine a resultant, and the manner in
which they change as the subject
works unsuccessfully at the task. 

2. The theoretical analysis also 
suggests the possibility of concep- 
tualizing each of the component 
motivations as a function of strength 
of motive, level of expectation, and 
magnitude of incentive value. Such a 
conceptualization would necessitate 
the precise specification and measure- 
ment of the particular motive, ex- 
pectation, and incentive value as- 
sumed to determine any component 
motivation, together with assump- 
tions about how these three factors 
combine to determine the strength of 
the component motivation. It is of 
interest to note that most expectancy 
value theories have employed a multi- 
plicative rule of combination 
(Feather, 1959a).

3. The theoretical analysis focuses 
attention on change in expectation as 
the basic dynamic principle mediating 
change in motivation. The task used 
in the investigation was one in which 
expectation of success could be as- 
sumed to decrease with experience 
since the subject failed at every at- 
tempt. But there are tasks, involving 
some degree of partial success, where 
expectation of success might be ex- 
pected to rise initially before falling. 
Furthermore, it appears important 
to consider the degree to which the 
task involves skill. It is a basic as- 
sumption in the investigation that it 
should take more unsuccessful at-
tempts at a task to reduce an expect-
ancy to a particular level from a high
initial expectation of success than
from a low initial expectation of suc-
cess. The results from the investiga-
tion and from the earlier study by
James and Rotter (1958) suggest that
this assumption is tenable in situa-
tions where the subject can relate
success to his own efforts or skill, and
failure to his own inadequacy rather
than to the influence of external
agencies beyond his control. How-
ever, when the subject considers
success and failure beyond his control 
(as in chance-type situations), ex- 
pectancy may decrease more rapidly 
in extinction when initial expectation 
of success is high than when it is low, 
an assumption which is consistent 
with the usual partial reinforcement 
effect obtained under external con- 
trol conditions. Clearly then, predic- 
tions about persistence using the 
dynamic principle of expectation 
change require explicit assumptions 
about the manner in which expecta- 
tion changes with experience at the 
activity and assumptions about con- 
ditions affecting the rate of change in 
expectation. 

4. Finally, the theoretical analysis
suggests by contrast a number of 
persistence situations which differ in 
important respects from the one 
investigated. In particular the fol- 
lowing persistence situations appear 
worthy of study: 

a. Situations where performance 
of the alternative activity involves 
motives which are not involved in 
performance of the initial task. In 
the present investigation both initial 
task and alternative task belong to 
the same class of activity. Both tasks 
are achievement related and total 
motivation to perform each is at- 
tributable to the same component 
motivations. But one can conceive 
of situations where performance of 
the alternative involves different
component motivations. For ex- 
ample, in the study of persistence by 
Atkinson and Litwin (1960), the ac- 
tivity in which the subject engaged 
when he left the examination room 
may have been quite unrelated to 
achievement, involving a different 
set of component motivations. 

b. Situations where component 
motivations involve the influence of 
incentive values which are indepen-
dent of expectations. In the specific
achievement context dependencies 
are assumed between incentive values 
and expectations, e.g., $I_s = 1 - P_s$. But
in most other situations it would be 
assumed that incentive values and 
expectations are independent. This 
means that, in the more general case, 
motivation to perform the activity 
would be expected to decrease con- 
tinuously with decrease in the expec- 
tation of goal attainment. This de- 
crease is in contrast to changes as- 
sumed to occur in the achievement 
situation where both motivation to 
achieve success and motivation to 
avoid failure show an initial increase 
in strength as the expectation of suc-
cess falls to $P_s = .50$.

c. Situations where the incentive 
is objectively present, e.g., a young 
child trying to overcome a barrier in 
order to get some candy. In the pres- 
ent investigation and also in extinc- 
tion studies the incentive is not in 
view and the experimenter has to 
take special precautions to ensure 
that the subject does not develop the 
expectation that there is in fact no 
incentive, e.g., that there is no solu-
tion to the problem or that the ap-
paratus has been "fixed" to prevent
the occurrence of further reinforce- 
ments. Should this happen the situa- 
tion would become one in which the 
subject sees success and failure as 
beyond his control. 

d. Situations involving different 
assumptions about extrinsic motiva- 
tion. In the present study it was 
assumed that extrinsic motivation to 
perform the initial achievement task 
is stronger than extrinsic motivation 
to perform the alternative. This 
assumption is a reasonable one when 
tasks are presented in a defined se-
quence with the implication that one
has to be performed before the other. 
However, one can conceive of situa- 
tions where different assumptions 


about extrinsic motivation could be 
made. For example, extrinsic motiva- 
tion may be assumed constant across 
tasks in the type of situation where 
the subject has free choice among 
various alternatives. The typical 
level of aspiration situation falls into 
this class. Thus, in a ring toss game 
the subject can select any one of a 
number of different lines from which 
to throw the ring. The only con- 
straint is that he make a choice. 
There is no suggestion that he should 
follow a definite sequence in selecting 
some lines before others. There may 
also be situations in which extrinsic 
motivation to perform the alterna- 
tive activity is stronger than extrinsic 
motivation to perform the initial 
task. For example, the experimenter 
might present the first task as a 
practice item and attach little im- 
portance to it. In contrast, the alter- 
native might be introduced as the 
test item and the experimenter might 
suggest that this is the task in which 
he is really interested. Thus, the 
theoretical approach suggests a wide 
array of different classes of persist- 
ence situation. It is believed that the 
kind of interactional approach dis- 
cussed above, based on a motiva- 
tional analysis, will help to elucidate 
the investigation of persistence in 
these other types of situation. 
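
The contrast drawn under point 4b, between incentive values that are independent of expectation and the achievement case in which $I_s = 1 - P_s$, can be shown numerically. This is a brief Python sketch with hypothetical function names and unit motive strengths; it assumes the multiplicative rule of combination described earlier and follows only the approach component, since the avoidance component has the same shape with opposite sign.

```python
def general_motivation(motive: float, expectation: float, incentive: float) -> float:
    """Motive x expectation x incentive, with incentive fixed independently."""
    return motive * expectation * incentive


def achievement_motivation(m_s: float, p_s: float) -> float:
    """Approach component in the achievement case, where I_s = 1 - P_s."""
    return m_s * p_s * (1.0 - p_s)


# Expectation falling step by step, as under repeated failure (hypothetical values).
falling_expectation = [0.9, 0.7, 0.5, 0.3, 0.1]

general = [general_motivation(1.0, p, 1.0) for p in falling_expectation]
achievement = [achievement_motivation(1.0, p) for p in falling_expectation]

for p, g, a in zip(falling_expectation, general, achievement):
    print(p, g, a)
```

In the first column of results motivation declines at every step, whereas in the achievement case it first rises, reaches its greatest strength when expectation is .50, and only then declines.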


SUMMARY AND CONCLUSIONS 


The present review has attempted 
to distinguish between different ap- 
proaches made in the literature to 
the study of persistence. As a more 
specialized aim, it has examined the 
relationship of persistence at an 
achievement task to the subject's 
initial expectation of success and to 
the strength of his achievement re- 
lated motives. The survey of the 
literature suggests the following main 
conclusions: 

1. Studies of persistence may be
classified into three main classes in
terms of the extent to which the 
approach adopted is personality ori- 
ented, situation oriented, or con- 
siders both personality and situation 
parameters. 

2. Trait studies of persistence are 
personality oriented and concentrate 
on stable characteristics of the person 
which are assumed to transcend the 
immediate situation and to determine 
some consistency in behavior. This 
type of approach has difficulty in 
accounting for variations in per- 
sistence from situation to situation. 

3. Studies of persistence conceived 
as resistance to extinction are situa- 
tion oriented and concentrate on 
properties of the immediate situation, 
particularly the characteristics of the 
acquisition series. This type of ap- 
proach has difficulty in accounting 
for variations in persistence from 
person to person. Extinction studies 
of the partial reinforcement type 
offer suggestive evidence concerning 
the relationship of persistence to 
expectation but there are problems 
in generalizing these results which 
are related to important differences 
between the typical partial reinforce- 
ment and persistence situations. 

4. Studies which conceive of per-
sistence as a motivational phenome-
non in general take both person and
situation parameters into account. 
This type of approach is thus unlike 
the two preceding ones for it has the 
potential of being able to account 
both for variations in persistence 
from situation to situation and for 
variations from person to person. In
addition it allows for the study of
these in interaction. Lewinian field
theory, while recognizing the inter-
dependence of person and psychological
environment in its basic equation,
tends to deal mainly with varia-
tions in the latter in its actual experi-
mentation. The theory of achieve-
ment motivation, developed by
Atkinson, is more explicit in recog- 
nizing the interaction of stable as- 
pects of the personality (motives) 
and more transitory situationally 
determined influences (expectations 
and incentive values) in determining 
motivation. 

5. A recent experimental study of 
persistence in an achievement con- 
text (Feather, 1960, 1961) is based on 
a detailed analysis of the different 
motivational components involved 
in performance of the initial achieve- 
ment task and performance of the 
alternative achievement task. A 
dynamic principle of decrease in 
expectation of success as the subject 
works at the initial task unsuccess- 
fully is used in conjunction with the 
theory of achievement motivation to 
generate differential predictions about
persistence for subjects differing in 
the relative strength of achievement 
related motives and in initial expec- 
tation of success. Results are in 
agreement with predictions. For 
example, subjects in whom it is as- 
sumed that the motive to achieve 
success is stronger than the motive to 
avoid failure persist longer at the 
initial achievement task when it is 
presented to them as easy rather than 
as very difficult; in contrast, subjects 
in whom it is assumed that the mo- 
tive to avoid failure is stronger than 
the motive to achieve success do just 
the reverse and persist longer when 
the achievement task is presented as 
very difficult rather than as easy. 

6. The study indicates the possi- 
bility of considering persistence as a 
motivational phenomenon, where the 
theory of motivation considers the 
interaction of both personality char- 
acteristics and situationally deter- 
mined influences. The theoretical 
analysis involved in the investigation 
shows: the importance of a detailed
analysis of the component motiva-
tions involved in performance of both
initial task and alternative, the pos- 
sibility of conceptualizing each com- 
ponent motivation in "expectancy 
value" terms, and the importance of 
change in expectation as a dynamic
principle mediating change in motiva-
tion. The theoretical analysis also
suggests a number of different types
of persistence situation worthy of
further investigation.


N. T. FEATHER


COGNITIVE LEARNING IN INFANCY AND
EARLY CHILDHOOD¹


WILLIAM FOWLER 
Yale University 


How early can a child learn com- 
plex cognitive operations? Can a pre- 
school child make substantial prog- 
ress with verbal symbols and other 
abstract concepts? If so, how soon 
and under what conditions? Strangely 
enough, these important questions 
scarcely have been asked in the psy- 
chological literature during the pres- 
ent century. Why? 

In this review we shall try to answer
these and other related questions 
through a critical examination of 
representative studies on cognitive 
learning in early childhood. Our ob- 
ject is to show how the neglect of 
certain key problems has contributed 
to psychology's low estimate of the 
young child's potential for cognitive 
learning. 

Learning in the preschool years has 
generally been known by almost any 
other name than cognitive learning. 
Yet, there is increasing awareness 
that cognition is involved in all men- 
tal processes, from conditioning, dis- 
crimination, and perception to intel- 
lective processes, concept formation 
and, of course, cognitive functioning 
itself (cf. Mussen, 1960). Unfortu- 
nately, these processes often have 
been treated as if they were isolated 
unitary categories of functioning (cf. 
Gibson, 1953; Gibson & Gibson, 
1955; Jones, 1954; Munn, 1954; 
Vinacke, 1951; Wohlwill, 1960). 

It is more logical, in many ways, 
to define even the simplest act of 
perception as a process incorporating
rudiments of ideation. The abstract-
ing of attributes is almost necessarily
integral to perception, while higher
intellectual activities evolve from a
process of combining the simpler
perceptual acts.


1 The writer is much indebted to Helen L.
Koch for her critical reading of an earlier
version of the manuscript.

Accordingly, the review will begin 
with a brief comment on the early 
acquisition of simple abilities. After 
devoting some space to motor de- 
velopment (gross and fine), we shall 
then survey the literature on the early 
learning of complex cognitive abil- 
ities. Studies will be organized into 
the following categories: verbal mem- 
ory, language, conceptual processes, 
and intelligence (IQ), and special cog- 
nitive processes—reading, mathemat- 
ical, musical, drawing, and writing 
abilities. There will also be a special 
section covering the relationship of 
early intellectual stimulation to emo- 
tional development. Methodological 
criticisms will appear throughout the 
review and will be summarized in a 
concluding discussion. 


SIMPLE ABILITIES 


The majority of studies of per-
ceptual-cognitive functioning in the
preschool child do not attempt to as- 
certain more than the status of abil- 
ities at a given age level. The focus 
has been on simple discrimination or 
perceptual problems embracing no 
more than one or two elementary con- 
cepts. Moreover, responses seldom 
have been followed over more than a 
short time span. 

There already are available sur- 
veys of simple abilities of this kind 
(e.g., Gibson, 1953; Munn, 1954; 
Wohlwill, 1960). We need only add 
here that many perceptual abilities, 
such as those involving discrimination 
of light, sound, color, form, depth, 
and size constancy are reported to be 
well established by 3-4 years of age, 
often before. Evidence of higher 
(second) order conditioning (Marin- 
esco & Kreindler, 1933) and stimulus 
generalization (Jones, 1931) has 
been observed as early as 15 months. 
Transfer of discriminations to other 
sets of stimuli differing on the same 
cues of size and shape has been found 
in children as young as 18 months (Rüssel, 1931). 

This pattern of findings suggests 
that concept formation begins during 
infancy. The conclusion is supported 
further by studies on problem solving, 
which are particularly relevant here. 
Several investigations show that 
children well under 3 can learn con- 
cepts based on the use of tools for 
solving a problem (Alpert, 1928; 
Matheson, 1931; Richardson, 1932, 
1934; Sobel, 1939). In Richardson's 
experiments, infants of 20-52 weeks 
learned to solve problems of increas- 
ing levels of difficulty. The prob- 
lems required the child to get a lure 
(toy) by pulling a string or to ring a 
bell by turning a lever. 

The ability of 2-year-olds to solve 
a problem based on a principle has 
been demonstrated by Roberts 
(1932), who used 43 subjects, 2-5 
years of age. The principle involved 
selecting the only door (of three) 
which matched in color a corresponding 
toy airplane suspended over it. 
All of Roberts' children solved this 
problem, although only those over 2 
years were able to verbalize the principle 
or generalize it to a new problem. 

It is important to note that the 
foregoing achievements cannot automatically 
be interpreted as ability 
ceilings for the ages cited. To determine 
these ceilings, even for simple 
abilities, we must probe a child's 
progress in prolonged practice, regard- 
less of age. 

Only Ling and Welch have done 
any work along these lines. Ling 
(1941) found that several months of 
training improved the ability of in- 
fants between 6 and 15 months of 
age to discriminate simple, geometri- 
cal shapes (circle, square, triangle, 
etc.). Infants could discriminate 
forms almost irrespective of changes 
in size or position. 

Welch's work is especially valu- 
able, not only because of his long- 
term training experiments, but also 
because he defined discrimination 
learning in relation to concept forma- 
tion. Welch (1939a, 1939b) initially 
obtained gross discriminations involv- 
ing concepts of size at 12 months 
and form and area by 14 months. 
With prolonged training between the 
ages of 18-28 months, another group 
of infants greatly refined their dis- 
criminations of form and area. They 
succeeded in making discriminations 
as fine as 1 or even .5 inch. The 
mean of the experimental group, at 
27 months, reached a level not at- 
tained by the controls until 53-60 
months of age. Considering the ex- 
tremely young age of Ling's subjects, 
and that Welch's subjects often were 
low IQ institutional children, these 
results promise much for studies of 
learning at more complex cognitive 
levels. 


MOTOR DEVELOPMENT 


Both perception and cognition un- 
doubtedly feature in motor function- 
ing: the more complex the skills, the 
greater their role. Their involvement 
probably is greater during the initial 
stages of learning, where it is neces- 
sary to identify and coordinate di- 
mensions of a motor task. Since the 
role of cognition in motor functioning 
seldom has been discussed, selected 
studies from this area will be covered 
in this review. The complexity of 
learning in relation to the duration of 
training will be the primary focus. 

There have been two main types of 
studies of motor processes in children, 
other than those of the short-term 
learning variety. Both of these have 
been typically cast in the framework 
of maturational hypotheses. In one 
approach, producing the greater bulk 
of studies, work seldom has gone be- 
yond defining and measuring a range 
of more or less simple motor skills, 
both gross (e.g., crawling or climb- 
ing) and fine (e.g., apprehending or 
releasing a pellet). Little attention 
has been paid to the influence of ex- 
perience on the development of these 
skills. It usually has been assumed 
that skills are acquired through "normal" 
processes of maturation, following 
a definite sequence of "stages" of 
motor development. Gesell and his 
followers at Yale (e.g., Gesell, Halver- 
son, Thompson, Ilg, Castner, Ames, 
& Amatruda, 1940; Gesell & Ilg, 
1949), Bayley (e.g., 1935), Shirley 
(e.g., 1931), and others have accumu- 
lated much data of this nature. 

The second type of motor investi- 
gation has been designed to determine 
the relative importance of maturation 
and learning in early development (cf. 
Gesell & Thompson, 1929; Hicks, 
1930a, 1930b; Hicks & Ralph, 1931; 
Hilgard, 1932, 1933; McGraw, 1935, 
1939; and others). The introduction 
of moderately long training periods, 
the teaching of relatively complex 
skills, and the use of control subjects 
are all found in this class of studies. 
In these studies an experimental group 
(or identical twin) usually has been 
trained for several weeks or more. 
Controls have then been given similar 
training, but often for shorter periods. 
Experimental and control children 
have been alternately kept in a nor- 
mal or deprived environment while 
the matched groups have undergone 
periods of training. In some cases 
(e.g., Hicks, 1930a, 1930b) the con- 
trol group was given no special train- 
ing. 

Nearly all investigators have found 
some evidence that, while training in 
specific skills often improves perform- 
ance, the amount of learning has 
been greater when the training has 
been deferred until the children were 
at least a few weeks older. They have 
frequently concluded, therefore, that 
age (maturation) is a more vital agent 
than experience in development. 
Many believe that training in skills 
of this type is more efficient as neural 
structures mature—up to a certain 
“optimum” age, which varies with 
complexity of skill. 

Often underplayed, however, is the 
fact that specific training has invari- 
ably produced large gains, regardless 
of whether training came early or 
late in development. In addition, 
partly as a result of the maturational 
bias, there have frequently been 
methodological shortcomings. These 
have further lessened awareness of 
the influence of experience upon de- 
velopment. In the first place, most 
studies have dealt with simpler skills 
for which there is evidence of ability 
"ceilings" or physiological limits. For 
example, in Hilgard's (1932) experiment 
on teaching ten 2-3 year old 
children buttoning, climbing, and 
cutting with scissors, the experimental 
children's learning curve tended to 
fall off toward the end of their 12- 
week training program. Such a ceil- 
ing limits the relative advantage pos- 
sible from longer training. Similarly, 
Shirley (1931) met limited success in 
trying to accelerate the speed of development 
in walking of her 25 infants. 
Again, are we not dealing with 
a skill tied more closely to biology, as 
McGraw (1935) suggests, than is true 
for more complex skills and categories 
of knowledge? 

Interesting evidence on the impor- 
tance training may have in develop- 
ing complex skills is reported by 
Mattson (1933). She found that, in 
subjects ranging from 58 to 72 months 
in age, training in a rather simple roll- 
ing-ball maze yielded little if any ad- 
vantage to the trained group. With 
more complicated patterns, however, 
the superiority of the trained group 
over the untrained increased as the 
patterns grew in complexity. 

Another frequent error in experi- 
mental design has probably produced 
an advantage for the control children. 
Prior to their own training, and while 
the experimental children were under- 
going training, the controls typically 
received a certain amount of experi- 
ence in related motor activities as 
well as in cognition generally. Both 
of these types of experiences would be 
expected to facilitate the subsequent 
learning of the controls. Yet most 
investigators in their conclusions, 
while mentioning these related ex- 
periences, have been inclined to count 
gains accruing from this period as the 
work of maturation alone. It appears, 
then, that maturation and general 
experience have been confounded. 
Moreover, given the stress in the 
American culture upon the value of 
sensory-motor experience, this in- 
fluence upon the development of the 
controls should not be minimized. 

Among other experimental defects 
of these earlier experiments has been 
the omission of such variables as 
personality and emotional dynamics. 
In contemporary thinking, they are 
widely considered to be highly potent 
influences on the learning process. 
For example, Gesell's Twin C was 
apparently, from the age of about 1 
year, more socially responsive, vocally 
articulate, and emotionally dominant 
over her experimental twinmate 
(Gesell & Thompson, 1929, 1941; 
Strayer, 1930). These are traits which 
readily could have enhanced her facil- 
ity for learning. 

Worth noting is the fact that the 
same person characteristically as- 
sumed the role of teacher for both the 
experimental and control groups in 
these experiments. In addition to the 
bias which his knowledge of and in- 
vestment in the experiment gave 
him, his teaching skills for the par- 
ticular training schedules presumably 
improved for the repeat performance 
with control children. All of these 
circumstances probably favored a 
maturational hypothesis. 

In several experiments (Hicks, 
1930a, 1930b; Hicks & Ralph, 1931; 
Hilgard, 1932), the subjects' ages 
spanned periods of from 1 to 4 years— 
periods which loom large in compar- 
ison with the few weeks of training 
allotted the experimental group. In- 
deed, age differences among subjects 
in some instances represented more 
than the total life experience of the 
younger subjects (2-year-olds). Yet, 
experimenters apparently assumed 
that maturational factors in motor 
development worked with similar 
effectiveness on children 2-6 years of 
age. Analysis of age trend variations 
might have proved rewarding. 

There have been virtually no motor 
investigations focused on exploring 
training methods suitable for the 
early years. In Shirley's study, for 
example, we do not know whether her 
didactic methods (coaching and verbal 
persuasion) were either appropriate 
or sufficient for stimulating 
gross motor development. Especially 
in view of the low order of verbal 
abilities prevailing at this age, different 
conditions and techniques may 
be more essential. Among these are 
a permissive, tension-free, and play-oriented 
atmosphere, together with 
the presence of effective imitation 
models, i.e., skilled peers. 

Studies of motor development, 
even where training has been offered, 
also have failed to measure the effects 
of pre-experimental experience upon 
abilities. Basic generalizations and 
attitudes relative to motor function- 
ing are undoubtedly acquired from 
the first months after birth. Serving 
as indispensable foundations for later 
learning, such schemas (to use Piaget's 
concept) are probably less easily 
acquired or altered at later stages. 
Nor have many experiments taken 
into account a related problem of 
equal significance: the span of train- 
ing must be long enough to permit 
mastery of complicated patterns of 
skills and to accumulate really sig- 
nificant quantities of knowledge. 
Limited training periods do not allow 
sufficient time for the conceptual 
transformation of stable schemas. This 
may be of little consequence for 
simple skills, which are characterized 
by low ability ceilings and probably 
require little modification of existing 
schemas. It could be critical for 
abilities (whether or not possessed of 
important motor components) which 
require the acquisition of a lengthy 
series of interrelated verbal symbols 
and concepts. 

McGraw's (1935, 1939) well-known, 
longitudinal training study of motor 
processes in twins is, for the length 
of training (2 years), rare among all 
studies of early learning, and unique 
for studies of motor development. 
Although the twins were fraternal, 
the trained twin's clear superiority in 
many skills over his control twin is 
significant. The latter received only 
2.5 months of training which did not 
start until he reached 22 months of 
age. McGraw, too, found evidence 
that the contrast in ability was most 
marked in complex skills, such as 
roller skating, swimming, and riding 
a tricycle. Cognitive processes were 
presumably more involved in these 
than in the simpler skills like sitting, 
standing, and walking. 

Gains in some skills were even 
permanent, up to the age of 6. These 
were skills where performance had 
reached a high degree of integration 
and mastery. They were also abilities 
not lost as a result of alterations in 
body proportions through growth. 
McGraw believed that the trained 
twin's roller skating ability, for ex- 
ample, was lost because alterations 
introduced new structural elements 
into the situation. In addition, the 
extensively trained twin developed 
better overall muscular coordination 
than his brother, suggesting the work 
of generalizing processes. He also be- 
came more confident than his control 
twin in all motor activities. Clearly, 
more longitudinal training studies 
along the lines of McGraw's design, 
but based on larger samples and 
greater refinement of methods and 
other variables, are in order. 


VERBAL MEMORY 


Longitudinal training has been as 
sparsely used for studying verbal as 
it has been for studying motor abil- 
ities. Burtt's (1932, 1937, 1941) un- 
usual and widely cited experiment on 
early auditory memory is the single 
exception in this area. It is a study 
which lends itself to much specula- 
tion regarding the value of early ver- 
bal education. Beginning training 
with his infant son at 15 months, 
Burtt read drama in the original 
Greek (meaningless to the boy) daily 
until he was 3 years old. Reading one 
20-line passage less than 1 minute per 
day, he covered a total of 21 different 
passages, 3 new ones being introduced 
every 3 months. At the age of 8.5, 
highly significant differences appeared 
in the ease of learning previously ex- 
posed passages over learning new ma- 
terial of the same kind. At 14, dif- 
ferences were still present although 
less sharp, while by 18 no differences 
were detectable. One wonders how 
much greater the son's progress would 
have been if meaningful passages had 
been used—such as simple stories 
geared in content and syntax to the 
child's level. 

In a short-term memory experi- 
ment, but utilizing meaningful nar- 
ratives, Foster (1928) presented 31 
children aged 2-7 to 4-9 (above 
average IQ) with 10 daily repetitions 
of each of 9 stories in sequence. In 
this way 22 children learned 8 stories 
(of the 9) and 9 children, 3-4 stories. 
Even the younger children learned 
sizable portions of the stories. Al- 
though the quantity learned was cor- 
related with both CA and MA, the 
shape of the learning curves was the 
same at all ages, except for a flat- 
tening during the final story for the 
younger children. This flattening 
may have been because teaching 
methods appear to have been more 
suitable to the motivations of the 
older children. Munn (1954) in his 
review of motor skills states that this 
has too often been the case. Munn 
found little or no evidence that learn- 
ing ability (which he defined as the 
rate of learning) is greater at older 
ages, only that the younger start at a 
lower level. 

These observations point to the 
desirability of beginning verbal stim- 
ulation early and following through 
consistently. They further indicate 
that methods suitable for stimulating 
children at their own pace and level 
need to be developed. It also would 
be useful to scale activities according 
to equal units of difficulty at all 
levels. The latter is a problem related 
to that of standardizing IQ tests ac- 
cording to mental age norms. 

Two further investigations on ver- 
bal memory (digits and names of ob- 
jects), namely, those of Gates and 
Taylor (1925) and Hilgard (1933) 
were designed to throw into relief 
maturational influences in develop- 
ment. Both investigators used sub- 
jects 4 years of age or older. The 
tendency for the effects of environ- 
mental factors to be obscured by the 
methodological characteristics of this 
type of design has been summarized 
in the previous section. It is only 
necessary to add that specific training 
again proved of definite value at all 
ages. The fact that the effects of 
practice were erased after a lapse of 
months, on the other hand, could 
reflect the narrowness and conceptual 
simplicity of the subject matter. 
Numerous experiments have shown 
that meaningless material is more 
difficult to recall than meaningful ma- 
terial (e.g., Osgood, 1953). Burtt's 
signal success with material of this 
kind, therefore, may be credited to 
the prolonged periods of training he 
used. 


LANGUAGE 


The language area provides still 
another study where Gesell's twins 
were used. This study was also in- 
spired by an emphasis on matura- 
tional factors. Strayer (1930) stim- 
ulated Twin T verbally (word-object 
associations and simple commissions) 
from the ages of 84-88 weeks (in- 
clusive), while Twin C was confined 
to a similar but language deprived 
environment. Twin C was then given 
4 weeks of equivalent language stimulation, 
during which time Twin T 
experienced a normal speech environment. 

Over 4 weeks of their training pro- 
grams, C acquired a vocabulary of 30 
words and T learned 23. But the 
difference in rate of learning was 
slight, 1.074 to .936 words per day, 
respectively, for their first 29 words. 
Moreover, in total vocabulary (35 
for T versus 30 for C), sentence 
structure and pronunciation, T was 
perceptibly superior to C, following 
T's only slightly longer but earlier 
training. Twin T also appeared to be 
gaining in rate over C toward the 
close of the training. The really 
critical variable is contained in the 
additional 5 weeks of general cogni- 
tive (though largely nonlanguage) 
experience that C accumulated prior 
to beginning training. We must also 
reiterate our other criticisms of this 
kind of design, chiefly, (a) the fact 
that the major progress of both twins 
was obviously associated with their 
specific training and (b) the sig- 
nificance of training was not demon- 
strated effectively because it was too 
brief. This second point would have 
been more dramatically illustrated, 
for example, had T's training been 
continued for 6 months or more, while 
C was confined to a deprived or 
"average" language environment. 

Dawe (1942) has contributed the 
only investigation using even relatively 
long-term language stimulation 
during the preschool years. She used 
older children, ranging from 43 to 82 
months in age. Dawe also studied 
other aspects of cognitive learning 
which will be reported further on. 
Coming from a deprived, orphanage 
environment, the subjects’ mean IQ 
was only 80.6 at the outset. Using 
matched controls of 11 pairs of 
children, Dawe gave the experi- 
mental group about 50 hours of 
training per child in language and 
general information concepts. Training 
consisted of viewing and discussing 
pictures, listening to poems and 
stories, taking excursions, and mak- 
ing simple observations. On nearly 
all cognitive measures the improve- 
ment of the experimental group over 
their controls was significant at the 
1% level. This included language 
measures of mean sentence length and 
the Smith-Williams vocabulary test. 
Indices of intelligibility and complex- 
ity of organization of speech also rose, 
but not significantly. Considering 
that training was restricted largely 
to weekends and lasted only 3 months, 
compared to the children's 4-7 years 
of total life experience, gains were in- 
deed sizable. Additional designs of 
this type but embracing younger 
children from more normal back- 
grounds and over longer time spans, 
should be carried out. 

A study by Milner (1951) offers 
promising retrospective information 
on the value of early language ex- 
perience. Taking those first grade 
Negro children who scored contrast- 
ingly high and low on a language de- 
velopment test, she compared the 
two groups with respect to measures 
of their social background. The high 
language scorers were found to have 
participated much more widely in 
adult family conversation and re- 
ceived more overt demonstrations of 
affection. Other analyses of preschool 
children's histories have also shown 
the importance of adult relations 
and other home variables for the 
development of children's language 
(Smith, 1935; Van Alstyne, 1929; 
Williams & Mattson, 1942). 

Beyond the foregoing studies, inter- 
est has centered on reporting age 
"norms" for acquiring various dimen- 
sions of language. It usually has been 
assumed that these norms directly 
reflect age-linked, maturational levels 
(cf. Chen & Irwin, 1946; Lewis, 1936; 
McCarthy, 1930; Smith, 1926). But 
developmental norms can as readily 
be subject to an environmental as to an 
innate growth interpretation, al- 
though interpretations of the former 
kind are infrequent. To illustrate an 
environmental approach, Ames and 
Learned (1948) record that the big- 
gest jump in the addition of new 
space concept words occurs between 
2-2.5 years. Is it possible that cul- 
tural stimulation in these concepts 
typically increases around this age so 
as to produce this jump? If so, 
shouldn't we ask whether cognitive 
development in this and other areas 
could not further be enhanced at all 
ages by the introduction of programed 
stimulation? 
Actually, from an environmental 
perspective, norms of language de- 
velopment in themselves constitute 
evidence of the educability of young 
children in symbolic and other com- 
plex conceptual relationships. Thus, 
Welch (1940a) reported that the 
greatest gains in abstract vocabulary 
occurred at the ages of 18-21 months 
and again during the fourth year. He 
noted (Welch, 1940b) that these 
periods parallel the periods of greatest 
vocabulary gain which Smith (1926) 
recorded. McCarthy (1930) observed 
that her 2-year-olds used a mean of 
‘9% compound and complex sen- 
tences. Several studies (McCarthy, 
1930; Shirley, 1933; Smith, 1926; 
Young, 1941) have found a mean 
sentence length of 2-2.5 words by 
age 2.5. The mean sentence length 
of Fisher's (1934) "gifted" group was 
5 words as early as 18 months, 
rising with age thereafter. In sum, 
as the evidence shows, children learn 
language from about the end of the 
first year, from then on steadily 
multiplying the number of discriminations 
and generalizations they 
grasp of the language system (McCarthy, 
1954). Language training 
appears to proceed systematically 
and functionally almost from birth. 
What then of the potential for mastering 
other cognitive systems, such as 
the reading process or time and num- 
ber concepts? 

There is reason to believe, more- 
over, that language achievement pro- 
ceeds more rapidly than has modally 
been estimated in the past, making 
the preschool child's potential for 
early cognitive learning proportion- 
ally greater. Language definitions 
used and methods of vocabulary 
sampling (the chief language index) 
have both been responsible for in- 
vestigators grossly underestimating 
the size of children's vocabularies at 
all ages (McCarthy, 1954). The chief 
errors have been failure to use an un- 
abridged dictionary, and insufficiently 
representative sampling of many 
categories of experience (Williams, 
1932). 


CONCEPTUAL PROCESSES AND 
INTELLIGENCE (IQ) 


Work concentrating on the nature 
and extent of the generalizing and 
abstracting powers of children's cog- 
nitive development has taken two 
main forms. One form has main- 
tained a strong interest in the elabora- 
tion of theoretical constructs on con- 
cept formation, not infrequently bound 
to specific content areas. This form 
incorporates processes of learning 
through the use of such concepts as 
“assimilation” and “accommodation” 
(Piaget, 1952). However, it tends to 
define development within an ab- 
stract gestalt framework rooted in 
maturational stages. This line of 
study, as represented in the ideas and 
work of Piaget and his followers (cf. 
Piaget, 1926, 1928, 1930, 1952, 1955; 
Piaget & Inhelder, 1956; Piaget & 
Szeminska, 1952) and of Werner 
(1957) is generally continental Euro- 
pean in origin and character. How- 
ever rich in ideas, the crude experi- 
mental design, sampling, and sta- 
tistical procedures, and the almost 
gratuitous age generalizations are 
well-known. These, along with theo- 
retical criticisms, have been dis- 
cussed in previous reviews (e.g., 
Huang, 1943; Thompson, 1952; 
Wohlwill, 1960). 

Piaget's theories of perceptual- 
cognitive development have never- 
theless stimulated some research on 
perception (cf. Wohlwill, 1960) and 
on the nature of children's thinking 
(Thompson, 1952; Vinacke, 1951). 
Most of the findings (e.g., Deutsche, 
1937; Oakes, 1946; Russell, 1940; 
Russell & Dennis, 1939) have re- 
ported little evidence to support 
Piaget's rigid and manifold classifica- 
tion of thought and emergent stages 
of development. Moreover, although 
formal logic is less characteristic of 
the younger ages, rationality is found 
at all ages and may vary with ex- 
perience almost as much as with age 
(Braine, 1959; Deutsche, 1937; Oakes, 
1946). 

Unfortunately, the consequences 
of long-term training rarely have 
been studied in theoretically derived 
work on conceptual processes at the 
preschool level. The work of Welch 
is an exception. He not only formu- 
lated a theory regarding the nature 
and order of development of abstract 
concepts, but he made efforts to 
transmute his theory into testable research 
designs (Welch, 1946, 1947a, 
1947b, 1948; Welch & Davis, 1935). 
His studies are not always carefully 
controlled nor clearly reported, how- 
ever. Essentially, Welch conceived 
of each higher order in a conceptual 
hierarchy or system as based on generalizations 
gradually formed from 
characteristics abstracted from a preceding, 
lower order. In one study 
Welch (1940a) found that his 12 
high average IQ children aged 21-26 
months had acquired a mean of 1.16 
first-order hierarchical concepts; his 
nine 27-33 months old subjects had 
a mean of 3.88. Second-order con- 
cepts appeared in his subjects around 
3.5 years (Welch, 1940a; Welch & 
Long, 1940). 

In one of his attempts at long-term 
training, Welch also tried without 
success over 6 months to teach genus- 
species relations to children 12-20 
months old. Data on IQ and other 
indices are incomplete. Welch re- 
ports that the children's motivations 
became almost completely destroyed 
in the course of training. Welch's 
teaching techniques are often formal, 
however, although he does occasion- 
ally mention in some studies efforts 
to maintain a "game atmosphere." 
This formality is especially manifest 
in another experiment (Welch, 1939c) 
where some 500 repetitions were re- 
quired to condition a child to asso- 
ciate a wooden plate with an arbi- 
trary name “ate.” This is probably 
a difficult, past tense concept for the 
retarded level of knowledge of his 
slightly low IQ institutional subjects. 
Even so, the age period of his sub- 
jects falls exactly within the one 
McCarthy (1930) describes (18-20 
months) as characteristic for learning 
word-object associations. Moreover, 
it seems a little rigid to persist in 
presenting so excessive a number of 
repetitions with a single, arbitrarily 
chosen stimulus term. The use of 
terms which harmonized more with 
cultural experience and a little pre- 
liminary exploration of the children’s 
existing level of knowledge seem to 
have been indicated. 

Despite Welch's formality, he was 
able to teach some verbal generalizations 
with dimensions of size and 
color to children less than 2 years old 
(Welch, 1939c). As reported earlier, 
he was also successful in imparting 
simpler nonverbal concepts of size, 
form, and area through discrimination 
learning. The combined contributions 
of Welch seem to open a broad 
avenue of possibilities for research on 
early conceptual learning. 

The second form of attack on chil- 
dren's generalizing processes has been 
the study of intelligence through 
mental age and IQ testing. Originat- 
ing mainly in Europe with Binet and 
couched in some theory, this approach 
has lent itself admirably to American 
and British empiricism through its 
ready translation of theoretical propo- 
sitions into measurable units of per- 
formance (Anastasi, 1954). The 
theoretical foundations of IQ tests 
(e.g., Stanford-Binet) have tended to 
languish accordingly. 

A heavy interest in the nature- 
nurture of intelligence (IQ) has been 
accompanied by a massive accumula- 
tion of studies, however thin in defini- 
tion the problem has remained. As 
these studies have been covered ex- 
tensively in many reviews (Anastasi, 
1958; Jones, 1954; Wellman, 1945; 
Wellman & McCandless, 1946; Whip- 
ple, 1928, 1940; and others), we shall 
concentrate principally on studies 
and problems directly relevant to our 
purposes. 

The largest amount of work has 
consisted of cross-sectional studies of 
children and adults of all ages. In 
these studies, investigators have cor- 
related IQ test performances and 
environmental indices (amount of 
education; parents' educational, occupational, 
and social class level; 
urban-rural differences; etc.). A 
multitude of relationships of this 
kind, often important, have been 
found. Their value may be considered 
only partially attenuated by the 
discovery of other associations which 
concede greater importance to hered- 
ity (e.g., between IQ scores of parents 
and their children). The basic prob- 
lem stems from the fact that broad, 
background correlations throw in- 
sufficient light on particular environ- 
mental antecedent-consequent rela- 
tionships. 

Few studies of any kind in the IQ 
area have been designed with the pre- 
cision necessary to illuminate these 
relationships. Some valuable infor- 
mation has come out of the work on 
identical twins reared apart, but it is 
based on sparse case data, usually ob- 
tained through the retrospective re- 
ports of biased informants. With 
radical and lifelong differences in 
their environments, twins have often 
shown marked differences in IQ 
scores in later years, up to as much 
as 24 points. In fact, twins have been 
found to differ in IQ almost in pro- 
portion to the degree of discrepancy 
experienced in cognitive features of 
their environments (Newman, Free- 
man, & Holzinger, 1937). Woodworth 
(1941), tallying data from existing 
studies, derived a correlation of .79 
between differences in amount of 
education and differences in IQ scores. 

Investigations on adopted children 
(mainly, Burks, 1928; Freeman, Hol- 
zinger, & Mitchell, 1928; Leahy, 
1935; Skodak, 1939; Skodak & Skeels, 
1949) provide another good though 
still imprecise source of information 
regarding environmental influences 
on IQ scores. Many of the children 
were placed for permanent adoption 
when only a few weeks old and were 
sometimes followed regularly over 
many years. But IQ measures were 
not always taken during the early 
years and the kind of training the 
children received was never specified 
nor controlled. As a result, the effect 
of experience, especially early experi- 
ence, upon intellectual development 
remained essentially unknown. 

Among other important methodo- 
logical oversights has been the failure 
to consider the influence of prenatal 
and neonatal factors or of selective 
placement. In addition, sample 
homogeneity has usually been greater 
among the foster children than among 
their controls. These problems are 
discussed at length in other sources 
(Anastasi, 1958; Jones, 1954). The 
typical IQ correlations between foster 
children-natural parents have ranged 
from .4 to .5. But it is evident that 
these correlations represent inflated 
values due to selective placement and 
sample homogeneity of the foster 
children. These two factors also 
have minimized IQ correlations be- 
tween foster parents-foster children, 
further obscuring the influence of the 
environment. Nevertheless, in the 
Skodak and Skeels (1949) longitu- 
dinal study of 100 foster children, the 
mean IQ difference between foster 
children (from the age of less than 6 
months through 13 years) and their 
natural mothers remained consist- 
ently at about 22 points. This is a 
sizable difference to persist in the 
face of methodological factors con- 
ceivably working in an opposite 
direction. 

Historical surveys of “gifted” chil- 
dren, if again more descriptive than 
precise, have furnished additional, 
important clues on the function of 
early cognitive stimulation (Cox, 
1926; Davidson, 1931). The associa- 
tion between cognitive precocity and 
the application of intensive stimula- 
tion from infancy has always been 
impressively high. This suggests a 
need for revising the concept gifted, 
since it is so heavily weighted in favor 
of heredity alone. 

In fact, this writer has managed to 
assemble from various sources a 
sample of 25 superior IQ children, all 
of whom learned to read by the age of 
3 (Cox, 1926; Davidson, 1931; Dol- 
bear, 1912; Hollingworth, 1926, 1942; 
Jones, 1923; Root, 1921; Stoner, 
1914; Terman, 1917, 1918, 1924). Of 
these, 72% had definitely enjoyed a 
great deal of unusually early and in- 
tensive cognitive stimulation. For 
the other 28% evidence regarding the 
quality and quantity of stimulation 
was lacking in the records. 

Where long-term training has been 
used at all, most programs have taken 
the form of nursery school education. 
These usually have been broad and 
uncertain in meaning and have sel- 
dom concentrated upon verbal de- 
velopment, an important component 
of many IQ tests (Anastasi, 1958; 
Wellman & McCandless, 1946). 

It is worth noting that studies 
which have measured IQ changes 
associated with nursery school ex- 
perience by means of the Merrill- 
Palmer scale have regularly shown 
greater IQ improvements than 
studies using some revision of the 
Binet scales (Wellman, 1945). In a 
Binet measured group of some 1,537 
children from 22 nursery groups, the 
mean increase was 5.4 IQ points, 
compared with .5 points for the con- 
trols (597 children in 14 groups). 
Using the Merrill-Palmer scale, in- 
vestigators found the improvement 
in 267 children in 7 nursery groups 
amounted to a mean of 14.5 points; 
73 nonschool controls in 4 groups 
registered a mean increase of only 6.7 
points. It appears reasonable to 
attribute some of this difference to 
the similarity of most Merrill-Palmer 
test items (pegboard, picture puzzles, 
block building, etc.) to nursery school 
type activities, while Binet items con- 
sist of a greater proportion of verbal 
items, even at the youngest age levels 
(Anastasi, 1958). Bayley (1949), too, 
observes that many Merrill-Palmer 
test "items test motor skills rather 
than insightful behaviors" (p. 228). 
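
The comparison between the two scales can be made explicit by subtracting each control group's mean gain from the corresponding nursery group's mean gain. The short Python sketch below does only that arithmetic on the four means quoted above from Wellman (1945). The dictionary and function names are illustrative assumptions, and the subtraction is a rough descriptive check, not an analysis reported in the source.

```python
# Mean IQ gains quoted in the text (Wellman, 1945).
# All names below are illustrative; none come from the source.
MEAN_GAINS = {
    "Binet": {"nursery": 5.4, "control": 0.5},
    "Merrill-Palmer": {"nursery": 14.5, "control": 6.7},
}


def net_gain(scale: str) -> float:
    """Nursery-group mean gain minus control-group mean gain, in IQ points."""
    gains = MEAN_GAINS[scale]
    return round(gains["nursery"] - gains["control"], 1)


if __name__ == "__main__":
    for scale in MEAN_GAINS:
        print(f"{scale}: net gain over controls = {net_gain(scale)} IQ points")
```

On these figures the net advantage over controls is 4.9 points on the Binet and 7.8 points on the Merrill-Palmer, so the larger Merrill-Palmer gain remains after the control gains are subtracted.
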
These characteristics of nursery 
school programs may further help to 
explain why high IQ scorers gain 
fewer points than low or average 
scorers do (Anastasi, 1958). Bright 
children have probably already 
mastered much of the relatively 
limited range of nonverbal, cognitive 
material offered in nursery programs. 
In the same vein, Klatskin (1952) 
studied the effects of good pediatric 
care and permissiveness in child rear- 
ing upon Cattell IQ test scores. Her 
group of 184, 1-year-old infants sur- 
passed test norms, except on verbal 
and some fine-motor type items. 
Even when training programs are 
used whose orientation is clearly 
cognitive, however, a problem of 
measurement still remains. Our in- 
struments for measuring “intelli- 
gence,” IQ tests, especially those for 
younger children, are relatively 
crude, unreliable, and theoretically 
undefined (Anastasi, 1954). In many 
cases, they yield only a composite in- 
dex, based on a loose mixture of 
verbal and nonverbal tasks, neither 
logically related nor organized into 
types of abilities. Understandably, 
there has been little accurate in- 
formation collected on the pattern- 
ing of preschool children’s abilities, 
not to mention relating variations to 
differences in types of training. For 
example, even with the less verbal 
Merrill-Palmer scale, the learning 
which is measured is not easily 
traceable to specific nursery activ- 
ities. Barrett and Koch (1930) found 
score increases in particular test 
items occurred more or less without 
relation to the frequency of participa- 
tion in certain activities in the school. 
In this instance, it appears that the 
standardization procedures used were 
partly responsible, since the test 
tended to overrate younger subjects 
increasingly to a point and to under- 
rate older ones. 

These limitations of preschool tests 
may also partially answer why corre- 
lations between the performances of 
preschool and older children on IQ 
tests remain persistently low (J. E. 
Anderson, 1939; L. D. Anderson, 
1939; Bayley, 1949, 1955; Honzik, 
Macfarlane, & Allen, 1948). Not 
only is the reliability of preschool 
tests per se low, but even the Binet 
contains fewer cognitively weighted 
items for the younger years. Our in- 
struments measure somewhat differ- 
ent abilities at different ages. This is 
no doubt a by-product of our tradi- 
tional propensity for deriving IQ 
measures empirically on the basis of 
internal consistency and age progres- 
sion (Meyers & Dingman, 1960). 
The spectrum of abilities measured 
also narrows with age. 

It is all the more interesting, 
therefore, to find that, despite the 
poor specification of both training 
and IQ tests, nursery school at- 
tendance has consistently produced 
small to moderate increases in IQ 
scores (Anastasi, 1958; Wellman, 
1945). This has been truer for 
children of less than superior ability, 
and is apparently based more on per- 
ceptual-motor than verbal learning. 

Of all the many studies of IQ, only 
four have developed long-term pro- 
grams of general cognitive stimula- 
tion for preschool children. One of 
these is a study in progress by this 
investigator on three pairs of iden- 
tical twins and a set of triplets. Two 
studies included no children younger 
than 43 months of age. In one study, 
Dawe (1942) followed the progress in 
language and general cognition that 
11 children made over 3 months of 
weekend training. Details of the in- 
vestigation were reported in the sec- 
tion on language. In addition to 
language gains, her subjects made 
significant mean gains in Binet IQ 
scores (14.2 points against a 2-point 
loss for the controls) as well as in 
other cognitive measures, i.e., science 
and home living information. 

McCandless (1940) attempted to 
raise the intellectual level of 6 very 
bright (139 IQ) preschool children by 
means of an enriched curriculum 
supplementing the regular nursery 
school program. The children's mean 
age was about 52 months. The pro- 
gram consisted of project work (con- 
structing a farm and a combination 
flower store-hotel) presented over 
much of 1 school year. Initial post- 
test differences in IQ scores (1916 
Stanford) between the experimental 
group and their matched controls 
were not significant. However, retest 
differences the following fall reached 
significance on the newly revised 
(1937) Stanford. The retest shift was 
attributed mainly to the nature of 
the 1937 test revision, which cor- 
rected for the decline in age asso- 
ciated with the 1916 Stanford. These 
results may appear more impressive 
when one questions just how stimu- 
lating this kind of project work could 
be for children already of high abil- 
ity—particularly given the Binet’s 
heavy use of verbal symbols by this 
age. 

Peters and McElwee (1944) fur- 
nished 8 months of guided play to 6 
young subjects (aged 2-9 to 4-2) of 
below average IQ who attended 
nursery school three mornings per 
week. Activities were organized ac- 
cording to categories of “functional 
intelligence," although the amount of 
verbal stimulation offered is not 
clear. Home visits and conferences 
with the mother were also included. 
Although there were no controls, 
children registered a mean gain of 7 
IQ points. While not statistically 
significant (.07), this included one 
child whose score fell as a result of ob- 
vious motivational distraction during 
posttesting. Taking these three 
studies together, their relatively well 
outlined and lengthy cognitive edu- 
cational programs contain many im- 
plications for education and the 
meaning of IQ measures. 

The extensive work done on intel- 
ligence in relation to the nature- 
nurture problem has been the subject 
of vigorous discussion and conflicting 
findings over many years (cf. Ana- 
stasi, 1958). But one regrettable con- 
sequence has been the manner in 
which arguments over the relative 
importance of heredity and environ- 
ment have obscured the fact that the 
environment does perform an essen- 
tial role in development. It is difficult 
to argue, considering all the evidence, 
that environment has a negligible role 
in producing mental test variations 
or in developing intelligence and con- 
ceptual abilities, however important 
genetic factors may also be. The dis- 
pute has seriously retarded interest 
in experimental work on preschool 
cognitive learning. It also has dis- 
couraged long range educational pro- 
grams aimed at developing each child 
to near the maximum level of his 
capacities. 


SPECIAL COGNITIVE PROCESSES 


As to be expected from the evi- 
dence considered so far, there is no 
particular cognitive category which 
has generated much research interest. 
This applies equally to all symbol or 
language systems, exemplified by the 
areas of reading, mathematics, and 
music. Yet, until the late 1800s it 
was common practice in Western cul- 
ture to begin instruction in the 
"three Rs," reading, writing, and 
arithmetic, as early as age 2 or 3 
(Adamson, 1905, 1930; Brown, 1924; 
Parker, 1912; Raymont, 1937; Ward, 
1901). 

Few careful investigations have 
come down to us but reports are clear 
that schools were few and learning 
limited, even at the older ages 
(Adamson, 1905, 1930; Birchenough, 
1920; Rusk, 1933; Salmon & Hind- 
shaw, 1904; Smith, 1931; Stow, 
1859). This is hardly surprising 
since, at all levels of the infant (ages 
2-7) and grammar school systems, 
many serious deficiencies prevailed 
which persisted well into the last half 
of the nineteenth century (Birche- 
nough, 1920; Forest, 1927; Raymont, 
1937; Rusk, 1933; Salmon & Hind- 
shaw, 1904; Stow, 1859). Curricula 
were narrowly restricted to religious 
dogma and the tool subjects (the 
three Rs), often making use of un- 
simplified texts, including above all 
the Bible itself. Authoritarian dis- 
cipline, enforced by harsh physical 
punishment, was the rule. Teaching 
methods were rigid and tedious, being 
based on rote learning through in- 
cessant drill on isolated elements. 
The lecture system was used freely, 
making little concession to age differ- 
ences. Enormous classes were char- 
acteristic. Infant schools typically 
confined immobile for hours from 50 
to 200 and sometimes as many as 
1,000 undernourished children, crowded 
into galleries. They were watched 
by the petty, severe, and ignorant 
eye of monitors only slightly older 
than the children. Over such a mass 
only one or two adult teachers pre- 
sided, who were poorly trained, if at 
all. 

It is almost astonishing, therefore, 
to discover that a number of even the 
youngest children apparently learned 
to read, write, compute, and acquire 
general knowledge to the extent they 
did under such primitive conditions 
and techniques. Moreover, accomp- 
lishments appear to have been quite 
extensive wherever conditions were 
ameliorated. These better schools 
and tutorial situations were un- 
happily restricted almost entirely to 
children from the wealthier classes 
(Adamson, 1905, 1930; Birchenough, 
1920; Brown, 1924; Parker, 1912; 
Reisner, 1930; Rusk, 1933). 

In any case it was in reaction to 
the generally harsh conditions that, 
over the nineteenth century, the 
ideas and methods of Rousseau, 
Pestalozzi, and especially Froebel 
took hold. They became major 
sources from which the kindergarten 
and the nursery school “movements” 
were formed, leading to a gradual 
revolutionizing of school systems at 
early childhood and other levels 
(Raymont, 1937; Salmon & Hind- 
shaw, 1904; Smith, 1931). 

Child centered programs and per- 
missive guidance techniques con- 
cerned with the rounded development 
of children in socioemotional and 
sensory-motor spheres were substi- 
tuted for academic, subject centered 
curricula. These and other educa- 
tional philosophies as well as the con- 
cepts of molar, developmental, and 
dynamic psychology later contri- 
buted many other specific influ- 
ences. Feeding these philosophies 
have been the ideas of Freud, 
Montessori, Dewey, Gesell, and ge- 
stalt psychology, in addition to pres- 
sures from periodic reform move- 
ments. In this way, wretched condi- 
tions, authoritarian discipline, and 
rigid methods have been slowly re- 
placed. 

Possibly the last major institu- 
tional effort to include a cognitive 
emphasis at the preschool level oc- 
curred in the early 1900s. Montessori 
(1912) organized a number of schools 
which combined the new, child 
centered principles with the ancient 
subject-matter orientation. These 
were highly successful in teaching 
children as young as 4 years old to 
read and write, and to acquire a 
broad array of simple concepts. 
Since that time, formal cognitive edu- 
cation in early childhood has ceased 
to flourish (Raymont, 1937; and 
others). It exists now only in those 
isolated instances where preschool 
children continue to be taught for- 
mally in scattered private schools 
(usually 4-year-olds in kindergarten) 
or informally in a home setting. 


Reading Ability 


Nearly all cases of early reading 
come to light from historical sources 
or from surveys of gifted children. 
Information is most detailed in the 
accounts of children taught individu- 
ally in the home by tutors, parents, or 
other relatives (Cox, 1926; Davidson, 
1931; Dolbear, 1912; Hollingworth, 
1926, 1942; Root, 1921; Sidis, 1911; 
Stoner, 1914; Terman, 1918; Witte, 
1914). These accounts prove to be 
one of the best stimulants for en- 
couraging research on cognitive learn- 
ing in early childhood. Methods are 
sometimes fairly well described. 
Much of this early childhood educa- 
tion has been devoted to general 
knowledge, and many "home edu- 
cators" have shared a strong belief in 
intensive education from birth as a 
sine qua non for the ontogeny of high 
intelligence. 

Winifred Stoner (Stoner, 1914; 
Griffiths, 1923) was given a carefully 
graduated program of reading and 
wide general knowledge by her 
mother from the age of 6 months on. 
By 16 months she could read and 
spell; by 2 years she could write. No 
definite degree of proficiency is speci- 
fied until age 3, when she had re- 
portedly acquired much skill in all 
and was beginning to typewrite. The 
child otherwise participated actively 
in physical and dramatic peer play 
and from childhood on experienced 
much school and some literary suc- 
cess. Although a controversial lay 
educational figure, the mother en- 
joyed considerable support from pro- 
fessionals. Her reading instruction 
methods (Stoner, 1914, 1916), in some 
ways typical of many home educa- 
tors, consisted primarily of syn- 
thesizing words, using phonic prin- 
ciples, along with reading simple 
texts. Letters were presented ini- 
tially in enlarged, colorful cutout 
forms and learning was further rein- 
forced through an endless variety of 
songs, plays, games, and stories. 

The case of Viola Olerich (Dolbear, 
1912) is of interest in that she was 
adopted at 8 months by a professor, 
Olerich, expressly for demonstrating 
his ideas on the early development of 
abilities. He succeeded in teaching 
the child to read by a sentence 
method shortly after the age of 17 
months. By the age of 2-11 she could 
read “with force and suggestion” al- 
most any reading material presented 
to her. 

An earlier illustration is furnished 
by a German clergyman, Witte 
(1914), who, before the son's birth in 
1800, predicted the high intellectual 
abilities which his son later acquired 
with the help of painstaking program- 
ing from infancy. The child learned 
to read and spell fluently well before 
5, through an alphabet-phonic 
method. He also became an excep- 
tionally thoughtful and well-in- 
formed, yet modest child. Attaining 
his doctorate easily at 14, he later 
rose to eminence as a leading univer- 
sity scholar in literature. 

Many other examples of preco- 
cious readers could be cited. Some of 
these, like Macaulay, John Stuart 
Mill, Francis Galton (Cox, 1926; 
Terman, 1917), and Norbert Wiener 
(1953), came to be renowned thinkers 
of their time. All of them experi- 
enced intensive cognitive education 
from the earliest years. 

Experimental investigation of early 
reading is restricted to some five 
studies, plus a sixth on which this in- 
vestigator is working. Of three early 
studies, Davidson's (1931) study on 
3-, 4-, and 5-year-old children was the 
most extensive. The children were 
divided by age and ability into 3 
groups of 4-5 children each, all hav- 
ing an MA of 4 years. The mean IQs 
of the respective groups were 128, 98, 
77 on the 1916 Stanford. 

Four months of individual and 
group reading instruction were 
offered 10-15 minutes daily in a 
morning kindergarten type program. 
A word-sentence method was used, 
with the aid of play activity, mo- 
tivating techniques, but drawing 
some attention to structural detail. 

In the course of training, nearly all 
children made definite headway in 
learning to read simple primer ma- 
terial. Two of the bright 3-year-olds 
covered the equivalent of a first 
grade program in less than half a 
school year and were rapidly becom- 
ing fluent readers. Moreover, at 
least one (with signs of another) of 
the 4-year-olds of average IQ was on 
the verge of fluent reading, while one 
dull 5-year-old was making sub- 
stantial progress. 

An outstanding feature of this ex- 
periment is that these results were ob- 
tained even with average and low IQ 
children well before the usual school 
age of 6. Another is that mean IQ in- 
creases of 9.2, 9.75, and 10.75 points 
were obtained by the respective 3-, 
4-, and 5-year-old groups over so rela- 
tively brief a period. Still another 
emerges in the fact that extensive 
cognitive stimulation at home (al- 
most daily parental story reading) 
had been experienced only by the 
best readers (all four bright 3-year- 
olds). In contrast, only one other 
child had been read to more than 
occasionally—and she was the most 
successful 4-year-old reader. 

In another early study, Brown 
(1924) experimented on short-term 
word training (nouns only) with 12 
children, aged 3-6 to 5-11 and with 
another child of only 22 months. 
She followed a stencil-tracing method, 
together with the use of printed flash 
cards, posters, and a primer, sustain- 
ing motivation by the use of games. 
Two children, a girl aged 3-10, whose 
IQ was only 91, and a boy of 3-6, 
whose IQ was 140, were given 20- to 
30-minute individual reading lessons 
in their homes, 5 days a week for 3 
months. 

Catherine, in 36 lessons, learned to read 11 
pages of her primer. Roger in 26 lessons, 
reached the same point. Both children could 
read their charts; and the recombination of 
words upon the charts, with perfect ease 
(Brown, 1924, p. 135). 


With respect to short-term word 
learning, even the child of 22 months, 
whose IQ was only 118, learned to 
discriminate three words in only five 
lessons. She still recognized them 3 
months later. 

One of the most remarkable and 
successful experiments in early read- 
ing was performed by a lay educator. 
It is included here because of its un- 
usual verification and documenta- 
tion, reported by Terman (1918). 
This anonymous father began to 
teach his daughter, Martha, to read 
from the age of 14 months, when her 
speech was limited to just three 
words. He first taught her the capital 
letters, so that the child's fourth 
spoken word was "pretty B" 
(pretty, being the third). Otherwise 
he followed a sight word method, em- 
ploying play oriented techniques and 
fabricating simplified sentence charts. 

Success was far from immediate, 
however. The next five months were 
passed primarily in teaching the 
child word-object relationships. Ap- 
parently, at 19 months certain im- 
portant concepts developed, for 
Martha rather suddenly, in 1 month, 
mastered first all the capital and 
then the small letters. At 21 months 
and 10 days, with a reading vocabu- 
lary of 35 words, she made some 
generalization to the effect "that 
these word pictures represented 
thoughts" (p. 223). At 23 
months Martha “began to experience 
the mental pleasure of reading,” 
(p. 225) and had acquired a reading 
vocabulary of 150 words. By 26.5 
months, she had covered 4.5 primers, 
embracing a vocabulary of over 700 
words. She could read fluently and 
meaningfully, but her pronunciation 
was less perfected. 

This case is unique in providing 
well-verified evidence of how a child 
of not more than 140 IQ and MA of 
just 3-0 years (extrapolated from a 
1916 Stanford given at age 3-11) 
learned to read. 

Of the three recent investigations, 
complete data is available only on 
one, an extensive monograph by 
Fowler (in press). He reports on a 9 
months’ exploratory effort to teach 
his 2-year-old daughter, Velia, to 
read, experimenting with a combina- 
tion of methods in a trial-and-error, 
but carefully regulated approach. 
Stimulation averaged 30 minutes 
daily and was freely immersed in 
dramatic play and family routines. 

During the program the subject ac- 
cumulated an out-of-context, recog- 
nition vocabulary of about 250 
words. Words tested had not been 
exposed to the subject for a mean of 
18.5 days and were recognized with 
about 87.3% accuracy. Toward the 
end Velia could read meaningfully a 
few isolated, primer type sentences, 
2-5 words in length. She was also be- 
ginning to read preprimers, and 
could perceive words with increasing 
precision using phonic cues. 

The child had also experienced a 
planned, broad program of intensive 
cognitive stimulation from her ear- 
liest weeks. This experience is be- 
lieved to have been a major factor 
in developing Velia’s IQ to the high 
range (150-170) within which she 
scored on all tests from age 2 to 8. 
Moreover, in preliminary pilot work, 
she learned to identify with ease all 
capitals at 21 months and most small 
letters by 23 months. 

Among techniques and dimensions 
explored at some length were dra- 
matic play, reinforcement proce- 
dures, phonics, whole-word learning, 
and the effect of varying type style 
and size. Picture-association and 
various other materials and methods 
of presenting stimuli were similarly 
experimented with. 

Follow-up tests and observations 
to age 8.5 have repeatedly revealed 
Velia to be reading avidly with com- 
prehension and facility 2-3 grades 
above level. In addition, she con- 
tinues to pursue enthusiastically a 
wide variety of other cognitive and 
social interests. It is worth recording 
that she is less advanced in certain 
cognitive areas, e.g., arithmetic, 
where she has received little special 
stimulation. Emotional problems are 
covered separately under psycho- 
social development. 

Investigations in progress by 
Moore and Anderson (1960) on early 
reading and writing are as yet pub- 
lished only in film form. From 
limited information,’ it appears that 
several children, aged 3-5, have made 
definite progress in learning to read 
and write. One child, who started at 
2-7, apparently reached a second 
grade level 6 months later. Of special 
interest is their method involving the 
use of an electric typewriter (using 
correct fingering), which probably 
enhances motivations as well as 
facilitates learning. 

Studies of reading instruction in 
the field of education have, by con- 
trast, been mostly restricted in 
contemporary times to grade school 
children. We find only an occasional 
study of reading (readiness) gradi- 
ents, for instance, which descends to 
the preschool years (e.g., Ilg & Ames, 
1950). Despite prevalent use of the 
concept of "reading readiness" (or, 
generically, “learning readiness"), 
which theoretically allows for wide 
individual differences in the age, rate, 
and facility for learning to read, in- 
struction in practice is basically 
linked to age norms. Reading con- 
tinues to be introduced at 5-7 years 
of age for almost all children, regard- 
less of ability, when the child enters 
elementary school (unless he is taught 
at home) (cf. Betts, 1957). 

Aside from the role of recent 
traditions, this practice parallels a 
predominant educational view (cf. 
Anderson & Dearborn, 1952; Dolch, 
1955; Gates, 1937; Monroe, 1951; 
Morphett & Washburne, 1931). It is 
widely asserted that children experi- 
ence difficulty in learning to read 
below a certain MA level, usually 
figured at about 5-7 years. 

O. K. Moore, personal communication, 

Taking a longer time span, the 
paucity of interest in early reading 
and cognition also may well be a by- 
product of theoretical viewpoints re- 
lating to the nature of the young 
child's emotional development, as we 
shall discuss shortly. These views 
are believed to have arisen as part of 
a strong and persisting historical 
reaction to the traditional authori- 
tarian and rote methods of education 
described earlier. 

Both the evidence we have pre- 
sented and Davidson's (1931) survey, 
on the other hand, indicate that, 
historically speaking, many 4-year- 
olds (MA) have learned to read 
fluently. Moreover, an estimate by 
the writer, based on data compiled 
from studies of intellectually superior 
children (Barbe, 1952; Cox, 1926; 
Davidson, 1931; Hollingworth, 1926, 
1942; Ilg & Ames, 1950; Miles, 1954; 
Terman, 1924, 1925; Witty, 1930, 
1940), suggests that perhaps .01% of 
the general population learns to read 
as early as 4 years (CA). Many of 
these children are ordinarily classified 
as gifted, but where records are ade- 
quate all precocious readers received 
a great deal of prior stimulation. 
Consequently, to label a child as 
gifted in no way dispenses with the 
necessity of stimulation—if he is to 
learn. However genetically bright a 
child may be, development does not 
proceed by maturation alone. It 
should further be re-emphasized that 
both Brown (1924) and Davidson 
(1931) made real headway in teach- 
ing even 3- and 4-year-olds of average 
IQ how to read, while Davidson made 
progress with low IQ 5-year-olds. In 
sum, it would appear that in the 
theory and practice of educating 
young children we have been missing 
both the MA and CA marks by a 
sizable margin. 


Mathematics 


Because of its specialized subject 
matter and symbol systems, mathe- 
matics is relatively independent from 
other branches of knowledge. In 
arithmetic, this peculiarity has been 
considered one reason for the de- 
velopment of certain precocious 
“wizard” calculators (Révész, 1940). 
These bizarre children often have 
been contrastingly undeveloped in 
other areas, including general mathe- 
matics itself. In extreme form, one- 
sided prodigies have been termed 
"idiot savants." The phenomenon 
also is found in musical abilities and 
sometimes in drawing, mechanics, 
memory, or motor coordination 
(Barlow, 1951; Rife & Snyder, 1931; 
Tredgold & Soddy, 1956). On the 
other hand, many mathematical, 
reading, and musical prodigies have 
been well-rounded in ability. Usu- 
ally, the mathematical prodigies have 
excelled in mental arithmetic and 
many of them apparently began their 
training prior to age 3 (Barlow, 1951; 
Bell, 1937; Mitchell, 1907; Rife & 
Snyder, 1931; Scheerer, Rothman, & 
Goldstein, 1945; Scripture, 1891; 
Tredgold & Soddy, 1956). According 
to Mitchell (1907) and Révész (1940), 
precocity has been the rule for pro- 
digious calculators, exceptional abil- 
ity usually being identified at about 
5-5.5 years in 75% of the cases. 

Unfortunately, little is known of 
the environmental origins of these 
abilities, especially concerning higher 
mathematics. Even less is available 
on teaching methods, beyond occa- 
sional comment, for instance, that a 
boy putatively learned by counting 
pebbles or peas as a means of passing 
the hours while tending sheep. The 
prevailing genetic orientation of ear- 
lier investigators may be partly re- 
sponsible for this lack of interest in 
the details of learning. A useful ex- 
ception is provided by Stoner (1914, 
1916), who reported that her child 
could, for example, count change by 
age 2. Stoner furnishes ample detail 
on teaching procedures, most of 
which call for games and similar 
motivating devices. 

The frequent unevenness in ac- 
quiring abilities also may be ex- 
plained as a contrast phenomenon. 
Perceptually, this would inflate gains 
achieved in one (or two) successful 
areas, when found in an individual 
of otherwise low abilities. The retro- 
spective basis and sketchiness of 
most case records would tend to pro- 
mote this effect. On the other hand, 
the observed disparity in well-docu- 
mented cases has led to a number of 
theories, nearly all of which accord 
some place to experience. Some 
emphasize the unequal inheritance of 
abilities; others stress that pathology 
(emotional and/or organic) may be 
implicated (Scheerer, Rothman, & 
Goldstein, 1945). What may be for- 
gotten is that the extent to which en- 
vironmental factors can develop abil- 
ities unequally is a measure of the de- 
gree to which abilities can be learned 
independently of one another. 

That long periods of learning are 
indispensable for the early acquisition 
of calculating and similar abilities is 
suggested by the following. First, case records,
although often less complete
than those on early reading, similarly 
point to years of exceptional, early 
stimulation and practice as the rule 
(Bell, 1937; Jakobsson, 1944; and 
others). Second, there is much evi- 
dence to suggest that the preschool 
child’s successful efforts to attain
high abilities have frequently been
early and liberally reinforced and 
channelized by such social role labels 
as "wizard" or "genius" calculator 
(Barlow, 1951; Mitchell, 1907; and 
others). Finally, it is unreasonable 
to believe that even the simpler
mathematical concepts, thousands of 
years in the attainment historically 
(Sarton, 1952, 1959), could emerge 
autogenously in untutored children, 
following Dennis' and Dennis' (1951) 
notion. 

Yet even contemporary research on
the numerical abilities of preschool 
children, as in other cognitive areas, 
is still highly concentrated on merely 
registering the ages and order of 
normal development. It is as if the 
development of abilities does indeed 
evolve autogenously or through ma- 
turation alone. In fact no experi- 
ments were found which were di- 
rectly concerned with how, or the de- 
gree to which, preschool children can 
learn quantitative relationships and 
numbers. Nor do educational theory 
and practice make any provision for 
teaching arithmetic before age 5, any 
more than they do for reading. 
Similarly, MA concepts of "arithmetic
readiness" have, for elementary
school programs, been interpreted
rigidly in terms of age norms (e.g., 
Harding, 1959; Washburne, 1928). 

The observations of Ilg and Ames 
(1951) illustrate this maturational 
outlook. Developmental gradients on 
several aspects of the number process
at 6-month intervals from the age of
1 year through 9 years have been
catalogued. In their usual fashion,
little attention is devoted to learning 
processes, mental growth being the 
descriptive framework. Similar col- 
lections of developmental observa- 
tions on number concepts in early 
childhood have been assembled by 
Descoeudres (1946), Giltay (1936),
and Long and Welch (1941). Piaget
and Szeminska (1952) have also 
worked in this area, although they 
are more concerned with theory than 
with facts. Probably the most im- 
portant modal shift in the acquisition 
of number concepts involves moving 
from simple perception of “oneness” 
and “twoness,” or broad discriminations
of quantity, to comprehension
and accurate perception of three or 
more units. This takes place usually 
around the age of 3.5. But nearly 
every study finds some children who 
make this basic transition toward 
functional mastery of the number 
system at the age of 3 or younger 
(and some at older ages). Therefore, 
unless the explanation is assumed to 
be totally a genetic one, the now un- 
probed differences in learning oppor- 
tunities should be investigated. This 
point is brought into sharper relief by 
Ilg's and Ames’ (1951) observation 
that adding behavior begins pri- 
marily at 5 and 6, precisely the ages 
when most children enter elementary 
school. 


Musical Ability 


Like mathematics, musical struc- 
ture is also comparatively inde- 
pendent of other categories of cogni- 
tion. Music, too, has frequently been 
learned early. But until the inception 
of the kindergarten movement, 
music, unlike the three Rs, was never 
prominent in most infant school pro- 
grams (Adamson, 1905; Raymont, 
1937; Reisner, 1930). As a result,
source material on the early develop- 
ment of musical abilities is largely 
biographical (e.g., Barlow, 1951; 
Bowerman, 1947; Maazel, 1950). 
The records show that the family (or 
tutors) has been the primary agent 
for musical education, at least until 
later in childhood. Teaching methods,
although seldom adequately described,
have often been surprisingly
formal from a very young age. 
Stoner (1914, 1916) in the music field 
was one of the innovators in deviat- 
ing from this formality, which of 
course simply paralleled the rigor 
traditionally used in all instruction. 
She adapted both vocal and instru- 
mental (violin) learning to her young 
daughter's levels of development, 
cognitively and dramatically. As we 
shall see from a survey of musical 
child prodigies, much early stimula- 
tion has taken the form of immersion 
in a rich musical milieu. But formal 
or informal, no early achievements 
are recorded (where records are de- 
tailed) without unusual stimulation 
having preceded these accomplish- 
ments. 

To mention a few well-known ex- 
amples (Maazel, 1950), Mozart, born 
into an intense family musical atmos- 
phere, began playing at about 3 
years, receiving formal lessons at this 
age. He began composing as early as 
age 5, attracting public attention 
soon after. Yehudi Menuhin was 
performing as a violin soloist with the 
San Francisco Symphony by 6; he 
had begun lessons formally at 3. 
Mendelssohn played in public at 9 
and composed fluently at 11; he had 
begun formal piano instruction at 
age 4. Josef Hofmann started lessons 
at 3.5 and performed in public at 6. 
Haydn began formal lessons at 5, 
having been exposed earlier to much 
informal stimulation, and was able 
to sing many songs from an early age. 
Jascha Heifetz accompanied his par- 
ents to concerts from infancy, took 
lessons formally from the age of 3 and 
played his violin in public by 5. 
Sandra Berkova displayed little sign 
of musical talent prior to beginning 
systematic violin instruction at 2.5 
with her mother, herself a concert
violinist. Berkova played her first
concert (of 14 pieces) at 3.5, making
a debut with the Los Angeles Sym- 
phony at 5. Both Heifetz and Ber- 
kova learned with the aid of an inter- 
esting teaching device; they played 
initially on a miniature violin (not a 
toy), fabricated to their size. Among 
many other cases which could be 
cited, Gaw (1922) reports a child who 
could hum tunes at 1 year and sang 
publicly at 4. She started piano 
lessons formally at 5 and later be- 
came a concert accompanist. Paral- 
leling other accounts where reporting 
is complete, this child was exposed to 
constant musical activity practically 
from birth. 

There also has been some experi- 
mental work done on the early de- 
velopment of musical skills, although 
less than on early reading. Curiously, 
there have been fewer efforts to tabu- 
late developmental norms for this 
area. In one such tabulation, how- 
ever, Jersild and Bienstock (1934) 
concluded that many children as 
young as 4 have developed such 
abilities as, for example, being able to 
sing a range of tones equal to those of 
the average adult. Two studies 
(Gaw, 1922; Stanton, 1922) at Iowa 
surveyed musical ability in the 
schools, using the Seashore measures 
of musical talent. Both were con- 
cerned with weighing the relative 
contributions of heredity and envi- 
ronment. Data were restricted to 
frequencies and percentages com- 
piled from rather incomplete and ret- 
rospective case material. Although 
the authors drew conclusions in favor 
of heredity, inspection of the data re- 
veals that the same pattern of find- 
ings is easily consistent with environ- 
mental explanations. The association 
between the children's abilities and 
their musical experience was poorly 
controlled. 

Four experimental studies, two of
them well controlled, have succeeded
admirably in teaching young children
vocal abilities. Jersild and Bienstock
(1931) taught 18 children from 30 to
48 months of age to sing songs for 10
minutes twice weekly over a stretch
of 6 months. The experimental 
children scored significantly higher on 
pitch and intervals tests than their 
untrained controls. Four months 
following the training, the practice 
group was still very much in the lead. 
Updegraff, Heiliger, and Learned 
(1938) report similar, highly positive 
results for their controlled study. 
Jersild and Bienstock (1934) later 
gave only 6 weeks training to 23 
children aged 3-8.5 years, but used no 
controls. They reported that some 
permanency of gains was retained 
over a 2-year period. They con- 
clude from their several studies that 
early training is advantageous to 
musical ability. If ability is not 
learned early, they believe “habits of 
disuse” may occur which inhibit 
learning at older ages. 

Hissen (1933) in a similar study 
produced evidence that children aged 
21-54 months improved in tonal 
discrimination and accuracy of tonal 
reproduction with training lasting 
from 1 to 2 nursery school semesters 
(10-20 lessons). In one experiment, 
Colby (1935) failed in teaching 3.5- to
4-year-old children to play a tin fife, in
part because the fingers of the young
children were too short for the instru- 
ment. Fabrication of a miniature in- 
strument, equivalent to the violins of 
Heifetz and Berkova, might have
resolved this difficulty.

Musical abilities in early child- 
hood have been only superficially 
studied. There are nevertheless
many positive findings, both historical
and experimental, on benefits
of early, long-term stimulation. Extensive
experimental work is necessary
to substantiate and expand
these findings and to establish a pool 
of information on techniques of 
musical training. 


Drawing and Painting 


The medium of drawing is perhaps 
not ordinarily thought of as a cogni- 
tive process based on symbols and 
abstract concepts in the same way 
mathematics and conventional lan- 
guage are. This may be because, in 
addition to being classified as a less 
definable artistic or creative activity, 
the finished product has often tended 
to "look like" our visual perception
of an aspect of reality. The fact of 
selectivity and the existence of 
techniques and rules for realizing 
different modes of expression in art 
work, however, point to the involve- 
ment of abstract, symbol systems 
(Robb & Garrison, 1935). Techni- 
ques for handling 3-dimensional per- 
cepts on 2-dimensional planes or for 
handling problems of chiaroscuro are, 
for example, highly complicated. 
The wide range of differences 
achieved in ability levels further 
suggests the complexity of learning 
implicated (whatever the role of in- 
herited talent). It is, moreover, prob- 
ably for these reasons that drawing 
tasks have been utilized in several 
IQ tests (e.g., Goodenough, 1926;
Jaffa, 1934; Terman & Merrill, 1937). 
But, as we have indicated, rules and 
methods for learning to draw and 
paint are usually less formally defined
or widely agreed upon than, say,
those for the structure of mathe- 
matics and music. This absence of 
formal definition may help to ac- 
count for the dearth of reports on 
precocious painters found in historical
sources.

Most research on drawing ability 
and related visual art forms has over
many years (Goodenough, 1926)
rarely stepped beyond an interest in
describing typical developmental se- 
quences. Monroe (1929), for ex- 
ample, has chronicled a series of 
stages of progression, starting at age 
2, along a continuum of the degree to 
which the child has learned to repre- 
sent reality in his spontaneous draw- 
ings. In general, drawing as a more 
complex conceptual process tends to 
emerge (according to the norms) 
around the age of 3, when the child 
begins to draw ideoperceptual forms 
planfully as opposed to the random 
production of scribbled lines and 
loops. Goodenough's (1926) MA and 
IQ test is based on the extent of com- 
plexity and realism the child utilizes 
in drawing the figure of a man, but 
descends only to age 3.5. Gesell and 
his followers (e.g., Gesell et al., 1940; 
Gesell & Ilg, 1949) have included 
drawing skills in their collections of 
developmental norms. Thus, for 
example, Gesell and Ames (1946) find 
certain directions prevail in the way 
a child draws at a given age. 

Dubin (1946), on the other hand, 
was successful in accelerating the 
mean development of a group of 
nursery school children in repre- 
sentational drawing abilities during a 
period of about 6 months of special 
training. The children ranged in age 
from 2-0 to 4-8 years, with a mean of 
3-3 years. They were classified according
to Monroe's (1929) “stages”
of drawing development. Each child 
was stimulated to achieve beyond his 
present stage of skill in easel paint- 
ing, by means of open-ended ques- 
tions or statements of encourage- 
ment. Differences in favor of the ex- 
perimental group over their matched 
controls (who engaged in 6 months of 
undirected drawing) were significant 
at the 1-2% level. Interest of the 
children in drawing also increased
significantly.


Results in this case are promising 
but the experiment is isolated and 
historical evidence limited. Without 
further research few conclusions can 
be drawn regarding the possibilities 
of early learning in this area.


Writing


Writing is closely related to draw- 
ing, in the sense of some of the per- 
ceptual-motor skills composing it. 
The chief difference lies in the ob- 
vious substitution of intricate verbal- 
symbol systems for the visual 
imagery and forms of drawing. 

As always, the empirical search for 
developmental age trends, based on 
maturational assumptions, is the pre- 
ferred investigational approach. The 
studies of Ames and Ilg (1951) and 
Hildreth (1936) illustrate this ap- 
proach. The developmental norms 
reported coincide with the age period 
when the child enters the formal 
school system. As might be expected,
the biggest jumps in the accurate
production of letters and words
generally occur during the sixth and 
especially the seventh years. 

Although writing was one of the 
three Rs in the early infant schools 
(Adamson, 1905; Brown, 1924; Rusk, 
1933), it may not have been given as 
much attention, since details are so 
little reported in historical sources. 
We know only of Stoner's (1914) 
daughter, who could write and spell 
simple sentences before she was 2, 
and the case of Francis Galton 
(Terman, 1917). Galton had learned 
to write simple prose by his fourth 
year, through the painstaking but 
gentle efforts of his older sister.

Montessori's (1912) success in 
teaching 4-year-olds to write in her 
nursery schools also needs reitera- 
tion, especially in the light of her de- 
tailed accounts on method (1912, 
1920). She first had the children
finger-trace raised abstract-geometric 
forms which had been treated to 
make a rough, sandpapered surface. 
This was followed by the gradual in- 
troduction of similarly fabricated, 
raised letters. Montessori found that, 
within a few weeks, children as young 
as age 4 began to write, in her terms, 
"spontaneously." Progress was al- 
most invariably rapid from this 
point on. 

Montessori’s pioneer investiga- 
tions on early cognitive learning have 
fostered no continuing interest, un- 
fortunately. But there is some recent 
revival of interest reflected in the preliminary
work of Moore and Anderson
(1960), cited earlier, and in a study
by this investigator (in progress) of a
3-year-old learning to write. Assuming
more extensive confirmation with
other subjects, the relationship 
of these early childhood writing 
achievements to norms like those of 
Ames and Ilg (1951) and of Hildreth 
(1936) is clear. Established with 
little reference to prior training, these 
norms should probably be defined
more narrowly, as cultural norms
arising from prevailing practices.


EARLY COGNITIVE STIMULATION 
AND PSYCHOSOCIAL 
DEVELOPMENT 


There is current a widespread belief
that to undertake extensive
cognitive stimulation with the infant 
and young child is to invite frustra- 
tion, learning inhibitions, and, in 
more extreme form, general personal- 
ity disturbances. Much of the foun- 
dation for this opinion is unquestion- 
ably rooted in concern over the emo- 
tional problems which are supposed 
to have been generated in tradition- 
ally rigid educational systems. There 
is also an ancient popular belief that 
abnormal development ("madness") 
is a concomitant of extreme mental
brightness ("genius") and early prodigy.
Scholars have, in the past, lent
some credence to this notion (Lange- 
Eichbaum, 1931; Lombroso, 1891). 
The basis for this latter view, if not 
the view itself, has been seriously 
weakened in recent times. There is 
now a plethora of studies demon- 
strating that intellectually preco- 
cious children enjoy above average 
social adjustment, as well as intel- 
lectual achievement, throughout 
their life span (Cox, 1926; Davidson, 
1931; Hollingworth, 1926, 1942; 
Miles, 1954; Terman, 1918, 1925; 
Terman & Oden, 1940a, 1940b, 1947; 
Witty, 1930, 1940). 

Contemporary psychological theory 
is nevertheless inclined to regard 
emotional damage to the child as an 
inherent consequent of the process of 
systematic (if not all) early training 
in cognition. Developmental and 
psychodynamic theory especially de- 
fines the nature of the young child as 
rather fragile, autistic, and irra- 
tional—at the mercy of his emotional 
life. It is also believed that he lacks 
perceptual-cognitive structure, ob- 
jectivity, and basic concepts essen- 
tial for assimilating cognitive stimuli 
to any important degree. Many 
stress the importance of neural im- 
maturity, requiring that certain 
stages of structural “ripeness” be at- 
tained before complex categories of 
intellectual stimulation can be ab- 
sorbed safely. To some extent these 
beliefs are intrinsic to the widely held 
concept of “learning readiness,” al- 
though the notion of “timely” stimu- 
lation which should not come too late 
or too early is common (cf. Betts, 
1957; Monroe, 1951; Thompson, 
1952; Witty, 1946). The young 
organism also is often considered to 
be governed by immature, gross per- 
ceptual-motor modes, fitting him 
poorly for the fine perceptual-motor
focusing which accompanies many
types of cognitive functioning. For 
example, premature demands for 
fine, visual discrimination (as in 
reading) upon undeveloped struc- 
tures are thought to make progress 
difficult and induce emotional and/or 
visual damage to the child (Gesell, 
1949; Gesell et al., 1949). 

It is also felt that extensive par- 
ticipation in social relations during 
the early years is essential for the 
most balanced personal develop- 
ment. Without quarreling with this 
last view, we believe that a few 
minutes to an hour a day spent with 
books and related intellectual stimu- 
lation is far from unbalancing. It 
may decidedly enrich development 
through augmenting knowledge and 
advancing environmental mastery. 

Because of the dangers imagined it 
is logical that specific tests of these 
hypotheses have been rare. In addi- 
tion, these fears have undoubtedly 
discouraged research on early cogni- 
tive learning per se, despite the fact 
that what little evidence can be 
mustered contradicts these theories, 
or is tenuous at best. 

In the first place, the excellent ad- 
justment and accomplishments which 
gifted children have realized (Cox, 
1926; Miles, 1954; Terman, 1918, 
1925; and many others) suggest the
value of intensive early stimulation 
which apparently all gifted children 
receive. Furthermore, we have no 
assurance that this special stimula- 
tion may not itself be an indispen- 
sable agent in the formation of high 
abilities. 

Secondly, in studies which in- 
cluded medium or low IQ children 
under 5, who were given some form of 
lengthy cognitive training, there is 
no evidence the children suffered ill
effects of any kind, vision included
(Brown, 1924; Davidson, 1931;
Dawe, 1942; Dubin, 1946; Gesell &
Thompson, 1941; Jersild & Bienstock,
1931; McGraw, 1939; Moore &
Anderson, 1960; Thompson, 1944;
Updegraff, Heiliger, & Learned, 1938;
and others). While investigation
never has centered on emotions, 
many of the studies included ma- 
terial relevant to this area (especially, 
Davidson, 1931; Gesell & Thomp- 
son, 1941; McGraw, 1939; Moore & 
Anderson, 1960; Thompson, 1944).
Of the latter studies, all except 
Gesell and Thompson (1941) are in- 
clined to attribute many benefits, 
socioemotional as well as cognitive, 
directly or indirectly to the stimula- 
tion. 

Experiments on the nature-nurture
issue, as we have observed, were 
rarely designed either to influence or 
to trace the kinds of experience en- 
countered during the early years. 
Only in nursery school programs has 
some effort been made along these 
lines and, here, perceptual motor and 
social skills have been found to pre- 
dominate. In the few studies that 
used specifically defined cognitive 
programs, the effect upon emotional 
adjustment and social learning ap- 
pears to have been highly favorable 
(Carr, 1938; Davidson, 1931; Peters
& McElwee, 1944; Thompson, 1944).
Thompson (1944) furnished 11 children
with a cautious program of cognitive
guidance in nursery school activities,
compared with minimum
guidance for the matched controls. 
IQs were not specified, but appar- 
ently were not far above average. At 
the end of one school year the experimental
children scored significantly
higher than their controls on the following
traits of social-personal development:
ascendance, constructiveness
(when faced with possible failure), 
social participation, and leadership 
ability. No change was found in the
proportion of "nervous habits," however.
Although both groups gained
in IQ scores, the experimental group 
made slightly larger gains (difference, 
p<.10). This small difference may 
be ascribed to the fact that the types
of activities were identical for both 
experimental and control children. 
More important, guidance techni- 
ques consisted only of increased 
personal attention, indirect methods, 
and some organizing of play ma- 
terials. Verbal guidance techniques 
were largely excluded. 

Peters and McElwee (1944) found, 
moreover, that their 6 socially de- 
prived 3-year-olds of low IQ ap- 
peared to benefit heavily in social and 
emotional functioning and suffer no 
harm from 8 months of organized 
cognitive stimulation. They gained 
42 points in social quotient on the 
Vineland Social Maturity Scale and 
increased on Merrill-Palmer Per- 
sonality ratings. Since controls were 
omitted, however, these results and 
other gains reported in self-confidence
and initiative in attacking
new problems should be eyed with 
caution. 

Evidence of a more negative sort 
occasionally has marred this gen- 
erally satisfactory picture. It will be 
remembered that Welch (1940b) en- 
countered some motivational re- 
sistance when he endeavored to teach 
genus-species relations to 12-20 
month old children. It is difficult to 
draw conclusions, however, since 
Welch supplies only spotty details. 
As noted earlier, Welch’s methods 
may sometimes have been inflexible 
and he failed to separate experi- 
mental methods from subject matter. 
Moreover, his subjects often learned 
the relevant concepts. It may also be 
of consequence that Welch does not 
mention emotional difficulties as an 
important or recurrent problem in
any of his extensive experimental
work with preschool children. 

There is, perhaps, only one other 
study directly concerned with the 
emotional problems connected with 
preschool cognitive learning (Fowler, 
in press). In Fowler's investigation of 
early reading with Velia, his 2-year- 
old daughter, he observed a number 
of signs of emotional maladjustment 
toward the end of the training. It 
was not possible to determine, how- 
ever, to what extent these may have 
been due to: (a) cumulative pressures 
which may have arisen from the reading
stimulation per se; (b) a known,
difficult, and concurrent nursery 
school conflict experience (the child 
was one of only two 2.5-year-old
children in a group of seven active
3- to 4-year-olds, all at least 7 
months older); or (c) other factors, 
such as faulty methods of teaching, 
singly or in combination with the 
first factors. It is noteworthy that 
the appearance of Velia's only im- 
portant and persisting emotional 
problems coincided to the week with 
nursery school attendance. The dif- 
ficulties followed 7 months of ade- 
quate and improving social adjust- 
ment and motivations for the train- 
ing. 

Velia has remained under observa- 
tion for the ensuing 5.5 years. Essen- 
tial recovery was disclosed (after 
some months) from earlier emotional 
troubles. Moderately good social 
adjustment has been maintained since 
that time. Interest and achievement 
in reading have been very high and 
the child also follows a broad pattern 
of cognitive and physical activities.

It is also true that studies of bright 
children, who were highly stimulated 
at an early age, have uncovered a 
small proportion who experienced 
some disturbance in personality functioning.
Cases have been concentrated
in the extremely high IQ subgroup
(e.g., Hollingworth, 1942). Significantly,
problems have generally
appeared to stem from the child's 
social role difficulties, arising from the 
manner in which society has tended 
to evaluate a child of this caliber— 
viz., "egghead"—(Hollingworth, 
1926, 1936, 1942; Miles, 1954; Root, 
1921; Terman, 1925; Terman & 
Oden, 1940a, 1940b, 1947; Witty, 
1930, 1940). Occasionally there is 
evidence that role adjustment prob- 
lems of this type may not have been 
the sole source of disturbance (e.g., 
Barlow, 1951; Dolbear, 1912; Hol- 
lingworth, 1926, 1936, 1942; Jordan, 
1933; Root, 1921). Post hoc case 
records have not proved decisive on 
these points, however. The concur- 
rent appearance of signs of faulty 
child rearing (parental ambivalence, 
overindulgence, etc.) or pressuring 
methods of training have precluded 
judgment of the effects of cognitive 
stimulation as such. 

Historically, as we have seen, the 
severe conditions in early infant
schools probably brought extensive
emotional problems. But where con- 
ditions were less rigorous, the socio- 
emotional consequences may even 
have been highly positive (cf. Rusk, 
1933). This contrast supplies some 
evidence that defective methods, 
rather than early stimulation, may 
endanger the child's emotions. 

Correlations between learning dif- 
ficulties and personality malfunc- 
tioning often have been reported in 
the field of educational research 
(Anderson & Dearborn, 1952; Gann, 
1945; Gates, 1941; Milner, 1951; 
Missildine, 1946; Robinson, 1946; 
Smith, 1955; and others). Findings 
have been complex and inconsistent, 
however, and the direction of
antecedent-consequent relations often has
remained uncertain. This has led
Smith (1955), among others, to conclude
that reading problems, for example,
and emotional maladjustment
both arise from a common constella- 
tion of causes. While there is some 
evidence that coercive methods may 
be harmful (e.g., Missildine, 1946), 
the indictment is unclear with respect 
to the role of preschool stimulation. 
Actually, data from educational 
sources mostly are concerned with 
elementary school age children and 
are of little concern here. 

Certain experiments in discrimination
learning (Razran, 1933), i.e.,
where differences between CSs are 
gradually reduced below threshold, 
have produced emotional disturb- 
ances in animals. Other experiments, 
employing a frustration-aggression 
hypothesis (Barker, Dembo, & Lewin, 
1941; Dembo, 1931; Fajans, 1933), 
have resulted in temporary frustra- 
tion reactions in children. These 
studies may have a narrow range of 
application because in both types of 
experiments there were actually no 
solutions available for the problems 
as they were defined. Nevertheless, 
the children often became markedly 
frustrated when confronted with com- 
plex stimuli apparently open to solu- 
tion (the frustration-aggression hy- 
pothesis problems). This may be 
analogous to the problem the very 
young child faces when presented with 
complex cognitive stimuli above his 
comprehension level, but which the 
adult presents as if the child should 
be able to grasp them. On the other
hand, is this problem essentially dif- 
ferent from the hazards of presenting 
stimulation at any age which is im- 
properly graded and paced for the 
child’s level? 

We wonder, in short, whether our 
modern anxiety over the damage 
childhood cognitive stimulation may 
induce is sufficiently anchored in
reality. In the probably well-founded
criticisms of severe methods formerly 
(and perhaps even now occasionally) 
used, was there any reason to assail 
the process of early cognitive instruc- 
tion per se? Is it possible that re- 
formist zeal, fed liberally by the dis- 
coveries of Freud and others, may 
have resulted in methods and condi- 
tions being profoundly confused with 
subject matter, that is, cognitive ma- 
terial as such? Research to date, al- 
though requiring much extension, ap- 
pears to harmonize very well with 
such an interpretation. 


DISCUSSION AND SUMMARY 


This critical survey of selected 
literature on cognitive learning in 
early childhood has been made in 
order to gain perspective on certain 
crucial but widely neglected research 
problems. Neglect of these problems 
has resulted in a stress on genetic 
factors in early cognitive develop- 
ment. Correspondingly, this has been 
at the expense of visualizing the scope 
and complexity of learning realizable 
during the initial phases of development.

Traditionally, there appear to have 
been two major viewpoints explaining
processes of development and
accounting for the protracted course
children go through to acquire com- 
plex knowledge. The first view fixes 
on concepts derived from biology. 
Under this view, intelligence and 
various abilities are postulated as in- 
herited. They emerge through a proc- 
ess of unfolding along a growth con- 
tinuum in several ordered stages of 
maturation. Modification of this innately
determined patterning by
environmental circumstances
is placed in secondary position.

In research, this view has led to a
concentration on empirical studies
organized cross-sectionally for collecting
normative data. For the most
part, these norms have been linked 
sequentially to chronological age 
levels (e.g., MA or any age-trait be- 
havior). The use of longitudinal 
studies has been, logically, an impor- 
tant additional derivative of this 
view. But a maturationally oriented 
developmental framework has so 
dominated long range investigations 
made under its jurisdiction, as to 
cloud evidence which might have led 
to alternate conclusions. In addi- 
tion, there are other important re- 
search errors and orientations which 
have tended to feed this bias. These 
include the frequent omission of ade- 
quate controls; the experimental 
fusion of variables; an emphasis on 
sensory-motor and social-emotional 
development to the relative exclusion 
of complex, perceptual-cognitive proc- 
esses; and, above all, the simple 
omission of (antecedent) experiential 
data. 

An important group of better con- 
trolled studies has probed the matura- 
tion versus learning issue. These 
studies have, in the main, continued 
subordinate to maturationally ori- 
ented hypotheses; have failed to sep- 
arate general experience from matu- 
ration; and have too often designed 
training programs with only simple 
material and/or have covered too 
short a time span. These limitations 
have aborted opportunities for im- 
pressive gains to accrue in cognitive 
learning.

The second more or less opposing 
view, while according a greater role 
to experience, has generally been 
dominated by a behavioristic out- 
look. In its concern for dealing with 
operationally definable processes, this 
outlook has only recently begun to 
divest itself of a molecular focus which 
has relegated complex conceptual
processes to the second-class status
of "intervening variables." This out- 
look also has generated a situational 
focus abstracted from the molar life 
situation, history, and developmental 
status of the organism. 

Preoccupation with precision and 
accurate measurement has induced 
many behavioral researchers to cling 
to high control, animal experimenta- 
tion of the laboratory. This has 
meant a comparative neglect of ex- 
perimentation on learning in children. 
The less frequent investigation of 
children has been much confined to 
correlational or cross-sectional study 
of a few isolated components of be- 
havior. Learning processes, where 
studied, have embraced problems 
whose solution depends only upon a 
few simple operations acquired over 
a restricted number of trials. Little 
attention has been directed toward 
longitudinal work on the antecedent- 
consequent relations of basic cogni- 
tive dimensions. 

The net effect of the two frame- 
works combined has been to retard 
the development of interest in cognitive
learning in young children. There
have been, however, other serious 
brakes upon the growth of such an 
interest. Experimental work can be 
expensive. Longitudinal studies of 
cognitive learning processes are likely 
to demand heavy outlays of time, 
energy, and resources. The detailed 
and costly nature of the necessary 
planning, observations, and teaching 
energies is multiplied by the intricacy 
of processes involved in acquiring 
knowledge. Reading a language, 
number calculation, playing a musi- 
cal instrument, and conceptualizing 
physical causality are all processes 
organized in terms of complex sys- 
tems of interrelated verbal symbols 
and concepts. Learning such systems 
inevitably becomes a long-term, increasingly
complex, and cumulative
process. Each series of steps is inte- 
grally related to and founded on 
lower orders of information. As a
result, progress demands that ma- 
terial be broken into simplified com- 
ponents and presented systematically, 
step-by-step in a long, shallow gradi- 
ent. 

Consider that even older children 
and adults are likely to require pro- 
longed time spans to make substan- 
tial headway in learning any com- 
plex cognitive system. Thus, the 
presumed inability of the preschool 
child to learn large blocks of knowl- 
edge may well be more apparent than 
real. Possibly it is more our failure 
to teach the child systematically in 
these areas than it is his inability to 
learn. The promising results in the 
few longitudinal training studies of
this kind support this conclusion. 

There are of course additional con- 
ditions which may contribute to the 
young child's difficulty with complex 
cognitive learning. Among these are 
characteristics which may inhere in 
the learning process itself during the 
initial stages of life. Something of the 
nature of this process has been ad- 
umbrated by learning theorists, not- 
ably, Hebb (1949) and Harlow (1949), 
and conceptualized in other ways by 
Piaget (1952). At birth, the infant, 
virtually lacking any knowledge at 
all, must spend an “apprentice” 
phase acquiring the most elementary 
foundation discriminations and gen- 
eralizations on the nature of the 
physical and social world. These pri- 
mary concepts are perhaps the most 
difficult and slowest to come by. This 
is because the infant possesses no 
general frames of reference to serve 
as guides or conceptual leverages for 
learning. Stated in another way, the 
neonate and child for some time is, 
essentially, learning the process of
how to learn; he is "learning to
learn" or is acquiring directional
learning sets. In this manner, what 
to an adult are tiny and insignificant 
steps in learning various systems are, 
to the very young child, probably 
giant strides. 

The early years may hold special 
problems for acquiring knowledge. 
But unless efforts to explore cogni- 
tive learning in young children are 
greatly multiplied, we shall continue 
to know little about them. Fortu- 
nately, signs of interest in a percep- 
tual-cognitive framework, if not in 
cognitive learning, have been appear- 
ing more often recently in writings 
on child development. Examples may
be found in the texts of Thompson 
(1952) and, particularly, of Baldwin 
(1955). The new handbook of re- 
search methods (Mussen, 1960) labels 
an entire section "cognitive processes."
The ideas of a number of psychodynamic
(e.g., Isaacs, 1945; Sullivan, 1953),
learning (Hebb, 1949; Harlow, 1949),
and perceptual-cognitive theorists
(e.g., Piaget, 1952;
Solley & Murphy, 1960; Werner,
1957) have also in some ways bridged 
traditional gaps existing among de- 
velopmental, learning, perceptual, 
cognitive, and motivational theories
of behavior. However, much remains 
to be elaborated. More important, 
to date little of this theory construction
seems to have stimulated corresponding
research activity on long-term cognitive
learning in child development.

Much if not most of the energy in
child psychology and development in
recent years has been concentrated on
the child's personality, perceptual-motor,
and socioemotional functioning
and development. Originating
primarily as a reaction to historically
inadequate and stringent methods,
fears have generalized to encompass
early cognitive learning per se as
intrinsically hazardous to develop- 
ment. As legitimate areas of study, 
the contributions of studies on per- 
ceptual-motor and socioemotional 
problems are obvious. But in the 
field of child guidance, interest in 
these areas has come to permeate and 
dominate work in child development 
almost to the exclusion of work on 
cognitive learning. In harking con- 
stantly to the dangers of premature 
cognitive training, the image of the 
“happy,” socially adjusted child has 
tended to expunge the image of the 
thoughtful and intellectually edu- 
cated child. Inevitably, in this atmos- 
phere, research (and education) in 
cognition has lagged badly, especially 
since the 1930s, not only for the 
early years of childhood but for all 
ages. 

Even prior to the more recent era, 
however, very little careful research 
was done on early cognitive learning. 
As historical evidence shows, most 
studies have comprised the work of 
those "beyond the pale" of formal
psychology. Yet, taken collectively, 
the findings are so provocative as to 
make us entertain hopes that many, 
if not all, children can and indeed 
should be offered much more cogni- 
tive stimulation than they have been 
generally receiving. 

There is, however, a further prob- 
lem, at once a derivative of and an 
important contributor to the failure 
to undertake work on cognitive 
learning. Few systematic methods 
have been devised for educating 
young children, especially in compli- 
cated subject matter. We have in 
mind methods for simplifying and 
organizing the presentation of cogni- 
tive stimuli. Equally important, 
methods must be sufficiently flexible 
and play oriented to be adaptable to 
the primary learning levels and personality
organization characteristic
of the infant and young child. 

The advantages of utilizing the 
now relatively untapped “preschool”
years for cognitive education are of
course manifest. Most obvious is the
availability of more years of child- 
hood to absorb the increasingly com- 
plex technology of modern society, a 
technology already requiring many 
of the more productive years of de- 
velopment to acquire. A second is the 
less evident but more crucial possi- 
bility that conceptual learning sets, 
habit patterns, and interest areas, 
may well be more favorably estab- 
lished at early than at later stages of 
the developmental cycle. Moreover, 
there may be problems inherent in 
allowing long-term sets and the ac- 
cumulation of knowledge in one area 
to take precedence over learning in
another. Conceivably, to establish
prior sets in one style (e.g., impulsive,
concrete, or gross motor orientations) 
may even strongly inhibit later at- 
tempts to learn in other directions 
(e.g., problem solving, abstract, or 
verbal orientations). As a minimum, 
cognitive orientations might pref- 
erably be established early in de- 
velopment, concomitant with other 
approaches. 

What of norms? Will research and 
educational programs of this kind 
eliminate the need for norms? The 
answer depends upon what circum- 
stances are taken into account. Yes, 
if we are thinking of norms for cognitive
development as presently measured
with little relation to differences
in children's life histories. No, if new 
ones are established or present ones 
modified on the basis of serious probes 
of the cumulative limits of cognitive 
learning during the early years. 


REFERENCES 


ADAMSON, J. W. Pioneers of modern educa- 
tion: 1600-1700. London: Cambridge 
Univer. Press, 1905. 

ADAMSON, J. W. English education. London: 
Cambridge Univer. Press, 1930. 

ALPERT, AUGUSTA. The solving of problem- 
situations by preschool children. Teachers 
Coll. Contr. Educ., 1928, No. 323. 

Ames, Louise B., & Ilg, Frances L. Developmental
trends in writing behavior. J.
genet. Psychol., 1951, 79, 29-46.

Ames, Louise B., & LEARNED, JANET. The 
development of verbalized space in the 
young child. J. genet. Psychol., 1948, 72, 
63-64.

ANASTASI, ANNE. Psychological testing. New 
York: Macmillan, 1954. 

ANASTASI, ANNE. Differential psychology. 
(3rd ed.) New York: Macmillan, 1958.

ANDERSON, I. H., & DEARBORN, W. F. The
psychology of teaching reading. New York:
Ronald, 1952.

ANDERSON, J. E. The limitations of infant 
and preschool tests in the measurement of 
intelligence. J. Psychol., 1939, 8, 351-379. 

ANDERSON, L. D. The predictive value of in- 
fancy tests in relation to intelligence at five 
years. Child Develpm., 1939, 10, 103-212. 

BALDWIN, A. L. Behavior and development in
childhood. New York: Dryden, 1955.

BARBE, W. A study of the reading of gifted 
high school children. Educ. Admin. Superv., 
1952, 38, 148-154. 

BARKER, R., DEMBO, TAMARA, & LEWIN, K. 
Studies in topological and vector psychol- 
ogy: II. Frustration and regression. U. Ia. 
Stud. child Welf., 1941, 18, No. 1. 

Barlow, F. Mental prodigies. London:
Hutchinson, 1951.

BARRETT, HELEN E., & KOCH, HELEN L. The
effect of nursery school training upon mental
test performance of a group of orphanage
children. J. genet. Psychol., 1930, 37,
102-122.

Bayley, Nancy. The development of motor
abilities during the first three years.
Monogr. Soc. Res. Child Develpm., 1935,
No. 1.

Bayley, Nancy. Review of Merrill-Palmer
Scale of Mental Tests. In O. K. Buros
(Ed.), The nineteen forty mental measurements
yearbook. Highland Park, N. J.:
Mental Measurements Yearbook, 1941.
P. 1406.

Bayley, Nancy. Consistency and variability
in the growth of intelligence from birth to
eighteen years. J. genet. Psychol., 1949,
75, 165-196.

Bayley, Nancy. On the growth of intelli- 
gence. Amer. Psychologist, 1955, 10, 805- 
818. 

BELL, E. T. Men of mathematics. New York: 
Simon & Schuster, 1937. 

Betts, E. A. Foundations of reading instruction.
New York: American Book, 1957.

Birchenough, C. History of elementary education
in England and Wales from 1800 to
the present day. London: Clive, 1920.

Bowerman, W. G. Studies in genius. New
York: Philosophical Library, 1947.

Braine, M. D. S. The ontogeny of certain 
logical operations: Piaget's formulation ex- 
amined by nonverbal methods. Psychol. 
Monogr., 1959, 73(5, Whole No. 475). 

Brown, Muriel W. A study of reading abil-
ity in preschool children. Unpublished 
master's thesis, Stanford University, 1924. 

Burks, BARBARA S. The relative influence of 
nature and nurture upon mental develop- 
ment: A comparative study of foster parent- 
foster child resemblance and true parent- 
true child resemblance. Yearb. Nat. Soc. 
Stud. Educ., 1928, 27, Part 1, 219-316. 

Burtt, H. E. An experimental study of early 
childhood memory. J. genet. Psychol., 
1932, 40, 287-295. 

Burtt, H. E. A further study of early child-
hood memory. J. genet. Psychol., 1937, 50, 
187-192. 

Burtt, H. E. An experimental study of early 
childhood memory: Final report. J. genet. 
Psychol., 1941, 58, 435-439. 

Carr, V. S. The social and emotional 
changes in a group of children of high intel- 
ligence during a program of increased edu- 
cational stimulation. Unpublished mas- 
ter's thesis, University of Iowa, 1938. 

Chen, H. P., & Irwin, O. C. Infant speech
vowel and consonant types. J. speech Dis.,
1946, 11, 27-29.

Colby, Martha G. Instrumental reproduction
of melody by preschool children. J.
genet. Psychol., 1935, 47, 413-430.

COX, CATHERINE M. Genetic studies of genius.
Vol. 2. The early mental traits of three hundred
geniuses. Stanford: Stanford Univer.
Press, 1926.

DAVIDSON, HELEN P. An experimental study
of bright, average, and dull children at the
four-year mental level. Genet. psychol.
Monogr., 1931, 9, 119-289.

DAWE, HELEN C. A study of the effect of an
educational program upon language development
and related mental functions in
young children. J. exp. Educ., 1942, 11,
200-209.

DEMBO, TAMARA. Der Ärger als dynamisches
Problem. Psychol. Forsch., 1931, 15, 1-144.


Dennis, W., & DENNIS, SENA G. Develop- 
ment under controlled environmental con- 
ditions. In W. Dennis (Ed.), Readings in 
child psychology. New York: Prentice- 
Hall, 1951. Pp. 104-131. 

DESCOEUDRES, ALICE. Le développement de
l'enfant de deux à sept ans. (3rd ed.)
Neuchâtel: Delachaux & Niestlé, 1946.

DEUTSCHE, JEAN M. The development of chil- 
dren's concepts of causal relations. Minne- 
apolis: Univer. Minnesota Press, 1937. 

DOLBEAR, KATHERINE E. Precocious children.
Pedag. Seminary, 1912, 19, 461-491.

Dolch, E. W. Methods in reading. Champaign,
Ill.: Garrard, 1955.

Dustin, ELISABETH R. The effect of training 
on the tempo of development of graphic 
representation in preschool children. J. 
exp. Educ., 1946, 15, 166-173. 

Fajans, Sara. Die Bedeutung der Entfernung
für die Stärke eines Aufforderungscharakters
beim Säugling und Kleinkind.
Psychol. Forsch., 1933, 17, 215-267. 

FISHER, M. S. Language patterns of pre- 
school children. Child develpm. Monogr., 
1934, No. 15. 

Forest, ILSE. Preschool education: A historical
and critical study. New York: Macmillan,
1927.

FOSTER, JOSEPHINE C. Verbal memory in the 
preschool child. J. genet. Psychol., 1928, 
35, 26-44. 

Fowler, W. Teaching a two-year-old to read:
An experiment in early childhood learning.
Genet. psychol. Monogr., in press.

FREEMAN, F. N., HOLZINGER, K. J., & MIT- 
CHELL, B. C. The influence of environment 
on the intelligence, school achievement, and 
conduct of foster children. Yearb. Nat. Soc. 
Stud. Educ., 1928, 27, Part 1, 103-217. 

Gann, Edith. Reading difficulty and personality
organization. New York: King's
Crown, 1945.

GATES, A. I. The necessary mental age for
beginning reading. Elem. sch. J., 1937, 37,
497-508.

Gates, A. I. The role of personality malad- 
justment in reading disability. J. genet. 
Psychol., 1941, 59, 71-83. 

Gates, A. I., & TAYLOR, GRACE A. An experimental
study of the nature of improvement
resulting from practice in a mental
function. J. educ. Psychol., 1925, 16, 583-592.

GAW, ESTHER A. A survey of musical talent
in a music school. Psychol. Monogr., 1922, 31(1,
Whole No. 140), 128-156.

Gesell, A. The developmental aspect of
vision. J. Pediat., 1949, 35, 310-316. 

GESELL, A., & AMES, LOUISE B. The development
of directionality in drawing. J.
genet. Psychol., 1946, 68, 45-61.

GESELL, A., HALVERSON, H. M., THOMPSON, 
HELEN, ILG, Frances L., Castner, B. M., 
Ames, Louise B., & AMATRUDA, CATHER- 
INE S. The first five years of life: A guide to 
the study of the preschool child. New York: 
Harper, 1940. 

GESELL, A., & ILG, Frances L. Child development:
An introduction to the study of human
growth. New York: Harper, 1949.

GESELL, A., ILG, Frances L., & BULLIS,
GLENNA E. Vision: Its development in infant
and child. New York: Hoeber, 1949.

GESELL, A., & THOMPSON, HELEN. Learning 
and growth in identical infant twins. Genet. 
Psychol. Monogr., 1929, 6, 1-124. 

GESELL, A., & THOMPSON, HELEN. Twins T
and C from infancy to adolescence: A bio- 
genetic study of individual differences by 
the method of co-twin control. Genet. psy- 
chol. Monogr., 1941, 24, 3-121. 

Gibson, ELEANOR J. Improvement in percep-
tual judgments as a function of controlled 
practice or training. Psychol. Bull., 1953, 
50, 401-431. 

Gibson, J. J., & GIBSON, ELEANOR J. Per-
ceptual learning: Differentiation or enrich- 
ment? Psychol. Rev., 1955, 62, 32-41. 

Giltay, M. Sur l'apparition et le développe- 
ment de la notion du nombre chez l'enfant 
de deux à sept ans. J. Psychol. norm. path., 
1936, 33, 673-695. 

GOODENOUGH, FLORENCE L. Measurement of 
intelligence by drawings. Yonkers: World 
Book, 1926. 

Griffiths, C. R. A general introduction to
psychology. New York: Macmillan, 1923. 

Harding, L. W. Arithmetic for child develop-
ment. Dubuque, Ia.: Wm. C. Brown, 1959. 

Harlow, H. F. The formation of learning
sets. Psychol. Rev., 1949, 56, 51-65. 

Hebb, D. O. The organization of behavior.
New York: Wiley, 1949.

Hicks, J. A. The acquisition of motor skill in 
young children. Child Develpm., 1930, 1, 
90-105. (a) 

Hicks, J. A. The acquisition of motor skill 
in young children: II. The influence of 
specific and of general practice on motor 
skill. Child Develpm., 1930, 1, 292-297. (b) 

Hicks, J. A., & RALPH, Dorothy W. The
effects of practice in tracing the Porteus 
diamond maze. Child Develpm., 1931, 2, 
156-158. 

HILDRETH, GERTRUDE H. Development of 
sequences in name writing. Child Develpm., 
1936, 7, 291-303. 

HILGARD, JOSEPHINE R. Learning and maturation
in preschool children. J. genet.
Psychol., 1932, 41, 31-56.

HILGARD, JOSEPHINE R. The effect of early
and delayed practice on memory and motor
performances studied by the method of co-twin
control. Genet. psychol. Monogr., 1933,
14, 493-567.

HISSEN, IRENE. A new approach to music for
young children. Child Develpm., 1933, 4,
308-317.

HOLLINGWORTH, Leta S. Gifted children:
Their nature and nurture. New York: Macmillan,
1926.

HOLLINGWORTH, Leta S. The Terman classes
at Public School 500. J. educ. Sociol., 1936,
10, 86-90.

HOLLINGWORTH, Leta S. Children above 180
IQ. New York: World Book, 1942.

HONZIK, MARJORIE P., MACFARLANE, JEAN
W., & ALLEN, LUCILE. The stability of
mental test performance between two and
eighteen years. J. exp. Educ., 1948, 17,
309-324.

Huang, I. Children's conception of physical
causality: A critical summary. J. genet.
Psychol., 1943, 63, 71-121.

Ilg, Frances L., & Ames, Louise B. Developmental
trends in reading behavior. J.
genet. Psychol., 1950, 76, 291-312.

Ilg, Frances L., & Ames, Louise B. Developmental
trends in arithmetic. J. genet.
Psychol., 1951, 79, 3-28.

Isaacs, Susan. Intellectual growth in young
children. London: Routledge, 1945.

Jaffa, ADELE S. The California Preschool
Mental Scale: Form A. Berkeley: Univer.
California Press, 1934.

Jakobsson, S. Report on two prodigy mental
arithmeticians. Acta med. Scand., 1944, 119,
180-191.

Jersild, A. T., & Bienstock, Sylvia F. The
influence of training on the vocal ability of
three-year-old children. Child Develpm.,
1931, 2, 272-291.

Jersild, A. T., & Bienstock, Sylvia F. A
study of the development of children's
ability to sing. J. educ. Psychol., 1934, 25,
481-503.

Jones, Alice M. The superior child: A series
of case studies. Psychol. Clin., 1923, 18,
1-8, 116-123, 130-137.

Jones, H. E. The conditioning of overt emotional
responses. J. educ. Psychol., 1931,
22, 127-130.

Jones, H. E. The environment and mental
development. In L. Carmichael (Ed.),
Manual of child psychology. (2nd ed.) New
York: Wiley, 1954. Pp. 631-696.

Jordan, A. M. Educational psychology. (Rev.
ed.) New York: Henry Holt, 1933.

KLATSKIN, ETHELYN H. Intelligence test per- 
formance at one year among infants raised 
with flexible methodology. J. clin. Psychol.,
1952, 8, 230-237. 

Lange-Eichbaum, W. The problem of genius.
London: Kegan Paul, 1931. 

Leahy, ALICE M. Nature-nurture and in-
telligence. Genet. psychol. Monogr., 1935, 
17, 236-308. 

Lewis, M. M. Infant speech: A study of the
beginnings of language. New York: Harcourt,
Brace, 1936.

LING, BING-CHUNG. Form discrimination as a
learning cue in infants. Comp. psychol.
Monogr., 1941, 17(2, Whole No. 86).

Lombroso, C. The man of genius. London:
Scott, 1891.

Long, L., & WELCH, L. The development of
the ability to discriminate and match numbers.
J. genet. Psychol., 1941, 59, 377-387.

Maazel, M. What to do about the child
prodigy. Etude, 1950, 68, 12-13, 60-61.

McCandless, B. R. The effect of enriched
educational experiences upon the growth of
intelligence of very superior children. Unpublished
master’s thesis, University of
Iowa, 1940.

McCarthy, DOROTHEA. The language development
of the preschool child. Minneapolis:
Univer. Minnesota Press, 1930.

McCarthy, DOROTHEA. Language development
in children. In L. Carmichael (Ed.),
Manual of child psychology. (2nd ed.) New
York: Wiley, 1954. Pp. 374-458.

McGraw, Myrtle B. Growth: A study of
Johnny and Jimmy. New York:
Appleton-Century, 1935.

McGraw, Myrtle B. Later development of
children specially trained during infancy:
Johnny and Jimmy at school age. Child
Develpm., 1939, 10, 1-19.

Marinesco, G., & KREINDLER, A. Des réflexes
conditionnels: I. L'organization des
réflexes conditionnels chez l'enfant. J.
Psychol., Geneve, 1933, 30, 855-886.

Matheson, Eunice. A study of problem solving
behavior in preschool children. Child
Develpm., 1931, 2, 242-262.

Mattson, Marion L. The relation between
the complexity of the habit to be acquired
and the form of the learning curve in young
children. Genet. psychol. Monogr., 1933,
13, 299-398.

Meyers, C. E., & Dingman, H. F. The structure
of abilities at the preschool ages: Hypothesized
domains. Psychol. Bull., 1960,
57, 514-532.

Miles, Catherine C. Gifted children. In
L. Carmichael (Ed.), Manual of child psychology.
New York: Wiley, 1954. Pp. 984-
1063.

MILLER, N. E., & DOLLARD, J. Social learning
and imitation. New Haven: Yale Univer. 
Press, 1951. 

MILNER, ESTHER. A study of the relationship 
between reading readiness in grade one 
school children and patterns of parent-child 
interaction. Child Develpm., 1951, 22, 
95-112. 

Missildine, W. H. The emotional back-
ground of thirty children with reading dis- 
abilities with emphasis on its coercive ele- 
ments. Nerv. Child., 1946, 5, 263-272. 

Mitchell, F. D. Mathematical prodigies.
Amer. J. Psychol., 1907, 18, 61-143. 

Monroe, Marion. The drawings and color
preferences of young children. Unpublished
doctoral dissertation, University of Chicago,
1929.

Monroe, Marion. Growing into reading. Chicago:
Scott, Foresman, 1951.

Montessori, Maria. The Montessori method.
New York: Frederick A. Stokes, 1912. 

Montessori, Maria. Dr. Montessori's own 
handbook. London: William Heinemann, 
1920. 

Moore, O. K., & ANDERSON, A. R. Early
reading and writing (motion picture): I. 
Skills. II. Teaching Methods. III. Develop- 
ment. Guilford, Conn.: Basic Education, 
1960. 

Morphett, MABEL V., & WASHBURNE, C.
When should children begin to read? 
Elem. sch. J., 1931, 31, 496-503. 

Munn, N. L. Learning in children. In L. Carmichael
(Ed.), Manual of child psychology.
(2nd ed.) New York: Wiley, 1954. Pp. 374- 
458. 

Mussen, P. H. Handbook of research methods
in child development. New York: Wiley, 
1960. 

Newman, H. H., FREEMAN, F. N., & HOLZINGER,
K. J. Twins: A study of heredity
and environment. Chicago: Univer. Chicago
Press, 1937.

Oakes, M. E. Children's explanations of
natural phenomena. Teachers Coll. Contr. 
Educ., 1946, No. 926. 

Osgood, C. E. Method and theory in experimental
psychology. New York: Oxford
Univer. Press, 1953.

Parker, S. C. History of modern elementary
education. New York: Ginn, 1912.

Peters, C. C., & McELWEE, A. R. Improving
functioning intelligence by analytical training
in a nursery school. Elem. sch. J., 1944,
45, 213-219.

PIAGET, J. The language and thought of the
child. New York: Harcourt, Brace, 1926.

PIAGET, J. Judgment and reasoning in the 
child. New York: Harcourt, Brace, 1928. 

PIAGET, J. The child's conception of physical 
causality. New York: Harcourt, Brace,
1930. 

PIAGET, J. The origins of intelligence in chil- 
dren. New York: International Univer. 
Press, 1952. 

PIAGET, J. De la logique concrète de l'enfant
à la logique propositionelle de l'adolescent.
Paris: Presses Universitaires, 1955. 

PIAGET, J., & INHELDER, BÄRBEL. The child's 
conception of space. London: Routledge & 
Kegan Paul, 1956. 

PIAGET, J., & SzeminsKa, ALINA. The child's 
conception of number. London: Routledge 
& Kegan Paul, 1952. 

Raymont, T. A history of the education of 
young children. London: Longmans, 
Green, 1937. 

Razran, G. H. S. Conditioned responses in 
children. Arch. Psychol., N. Y., 1933, No. 
148. 

REISNER, E. H. The evolution of the common 
school. New York: Macmillan, 1930. 

Révész, G. The indivisibility of mathemati- 
cal talent. Acta psychol., Amst., 1940, 5, 
1-21. 

RICHARDSON, HELEN M. The growth of adaptive
behavior in infants: An experimental
study of seven age levels. Genet. psychol. 
Monogr., 1932, 12, 195-359. 

RICHARDSON, HELEN M. The adaptive behavior
of infants in the utilization of the
lever as a tool: A developmental and ex- 
perimental study. J. genet. Psychol., 
1934, 44, 352-377. 

Rife, D. C., & Snyder, L. H. Studies in
human inheritance. Hum. Biol., 1931, 3, 
547-559. 

Robb, D. M., & Garrison, J. J. Art in the
western world. New York: Harper, 1935. 

ROBERTS, KATHERINE E. The ability of pre- 
school children to solve problems in which 
a simple principle of relationship is kept 
constant. J. genet. Psychol., 1932, 40, 118- 
135.

Robinson, HELEN M. Why pupils fail in
reading. Chicago: Univer. Chicago Press, 
1946. 

Root, W. T. A socio-psychological study of
fifty-three supernormal children. Psychol. 
Monogr., 1921, 29(4, Whole No. 133). 

Rusk, R. R. A history of infant education. 
London: Univer. London Press, 1933. 

Rüssel, A. Über Formauffassung zwei- bis
fünfjähriger Kinder. Neue psychol. Stud.,
1931, 7, 1-108. (See Wohlwill, 1960) 

RUSSELL, R. W. The development of animism.
J. genet. Psychol., 1940, 56, 353-366.

Russell, R. W., & Dennis, W. Studies in
animism: I. A standardized procedure for
the investigation of animism. J. genet.
Psychol., 1939, 55, 389-400.

SALMON, D., & HINDSHAW, WINIFRED. Infant
schools. Longmans, Green, 1904.

Sarton, G. A history of science. Vols. 1 and
2. Cambridge: Harvard Univer. Press,
1952, 1959.

Scheerer, M., ROTHMAN, E., & GOLDSTEIN,
K. A case of "idiot savant": An experimental
study of personality organization.
Psychol. Monogr., 1945, 58(4, Whole No. 
269). 

Scripture, E. W. Arithmetical prodigies.
Amer. J. Psychol., 1891, 4, 1-59.

Shirley, Mary M. The first two years: A
study of twenty-five babies. Vol. 1. Postural
and locomotor development. Minneapolis:
Univer. Minnesota Press, 1931.

SHIRLEY, Mary M. The first two years: A
study of twenty-five babies. Vol. 2. Intellectual
development. Minneapolis: Univer.
Minnesota Press, 1933.

Sidis, Boris. Philistine and genius. Boston:
Moffat, Yard, 1911.

Skodak, Marie. Children in foster homes: A
study of mental development. U. Ia. Stud.
child Welf., 1939, 16, No. 1.

SKODAK, Marie, & Skeels, H. M. A final
follow-up study of one hundred adopted
children. J. genet. Psychol., 1949, 75, 85-
125.

Smith, Frank. A history of English elementary
education: 1760-1902. London:
Univer. London Press, 1931.

Smith, Madorah E. An investigation of the
development of the sentence and the extent
of vocabulary in young children. U. Ia.
Stud. child Welf., 1926, 3, No. 5.

Smith, Madorah E. A study of some factors
influencing the development of the sentence
in preschool children. J. genet. Psychol.,
1935, 46, 182-212.

Smith, Nila B. Research on reading and the
emotions. Sch. Soc., 1955, 81, 8-10.

Sobel, B. A study of the development of
insight in pre-school children. J. genet.
Psychol., 1939, 55, 381-388.

Solley, C. M., & Murphy, G. Development
of the perceptual world. New York:
Basic Books, 1960.

Stanton, Hazel M. The inheritance of
specific musical capacities. Psychol.
Monogr., 1922, 31(1, Whole No. 140), 157-
204.

Stoner, Winifred S. Natural education.
Indianapolis: Bobbs-Merrill, 1914.

Stoner, Winifred S. Manual of natural education.
Indianapolis: Bobbs-Merrill, 1916.

Stow, D. The training system of education.
(11th ed.) London: Longman, Green,
Longman, & Roberts, 1859. 

Strayer, Lois C. Language and growth: The
relative efficacy of early and deferred vo- 
cabulary training, studied by the method 
of co-twin control. Genet. psychol. Monogr., 
1930, 8, 215-317. 

SULLIVAN, H. S. The interpersonal theory of
psychiatry. (Ed. by Helen S. Perry & Mary 
L. Gawel) New York: Norton, 1953. 

TERMAN, L. M. The intelligence quotient of 
Francis Galton in childhood. Amer. J. 
Psychol., 1917, 28, 208-215. 

Terman, L. M. An experiment in infant
education. J. appl. Psychol., 1918, 2, 219-
228.

Terman, L. M. The possibilities and limita- 
tions of training. J. educ. Res., 1924, 10, 
335-343. 

TERMAN, L. M. Genetic studies of genius. 
Vol. 1. Mental and physical traits of a thousand
gifted children. Stanford: Stanford
Univer. Press, 1925. 

Terman, L. M., & MERRILL, MAUD A. 
Measuring intelligence: A guide to the ad- 
ministration of the new revised Stanford- 
Binet tests of intelligence. Boston: Hough- 
ton Mifflin, 1937. 

TERMAN, L. M., & ODEN, MELITA. The significance
of deviates: II. Status of the
California gifted group at the end of sixteen 
years. Yearb. Nat. Soc. Stud. Educ., 1940, 
39, Part 1, 67-74. (a) 

Terman, L. M., & Oden, Melita. The
significance of deviates: III. Correlates of 
adult achievement in the California gifted 
group. Yearb. Nat. Soc. Stud. Educ., 
1940, 39, Part 1, 74-89. (b) 

TERMAN, L. M., & Open, MELITA. The gifted 
child grows up. Stanford: Stanford Univer. 
Press, 1947. 

Tnoursox, G. G. The social and emotional 
development of preschool children under 
two types of educational program. Psychol. 
Monogr., 1944, 56(5, Whole No. 258). 
HOMPSON, G. G. Child psychology. Boston: 

T Houghton Mifflin, 1952. 

REDGOLD, R. F., & Soppy, K. A textbook of 
. deficiency. (9th ed.) Baltimore: 

v illiams & Wilkins, 1956. 

PDEGRAFF, RUTH, HEILIGER, L., & LEARNED, 
un The effects of training upon the 
S ging ability and musical interests of 
Ec four, and five-year-old children. 
2 ris Stud. child Welf., 1938, 14, 83-131. 
d 5 DonorHv. The environment 
ea mee. vear- old children: Factors related 
x intelligence and vocabulary tests. 

eachers Coll. Contr. Educ., 1929, No. 366. 


151 


ViNACKE, W. E. The investigation of concept 
formation. Psychol. Bull., 1951, 48, 1-31. 

WARD, AGNES. Some aspects of theory and 
practice in infant education. In R. D. 
Roberts (Ed.), Education in the nineteenth 
century. Cambridge, England: Univer. 
Press, 1901. Pp. 15-33. 

WASHBURNE, C. When should we teach 
artithmetic? A committee of seven investi- 
gation. Elem. sch. J., 1928, 28, 659-665. 

Wetcu, L. The development of discrimina- 
tion of form and area. J. Psychol., 1939, 7, 
37-54. (a) 

WELCH, L. The development of size dis- 
crimination between the ages of 12 and 40 
months. J. genet. Psychol., 1939, 55, 243- 
268. (b) 

WEtcu, L. The span of generalization below 
the two-year age level. J. genet. Psychol. 
1939, 55, 269-297. (c) 

Wetcu, L. The genetic development of the 
associational structures of abstract think- 
ing. J. genet. Psychol., 1940, 56, 175-206. (a) 

Wetcn, IL. A preliminary investigation of 
some aspects of the hierarchial development 
of concepts. J. genet. Psychol., 1940, 22, 
359-378. (b) 

We cn, L. Recombination of ideas in crea- 
tive thinking. J. appl. Psychol., 1946, 30, 
638-643. 

We cu, L. A behavioristic explanation of 
concept formation. J. genet. Psychol., 1947, 
71, 201-222. (a) 

Wetcu, L. The transition from simple to 
complex forms of learning. J. genet. Psy- 
chol., 1947, 71, 223-251. (b) 

WELCH, L. An integration of some funda- 
mental principles of modern behaviorism 
and gestalt psychology. J. genet. Psychol., 
1948, 39, 175-190. 

We cn, L., & Davis, H. I. ] 
abstractions and its application. 
1935, 15, 138-145. 

WIC, L., & Lone, L. The higher structural 
phases of concept formation of children. J. 
Psychol., 1940, 9, 59-95. 

WELLMAN, BETH L. IQ changes of preschool 
and nonpreschool groups during the pre- 
school years: A summary of the literature. 
J. Psychol., 1945, 20, 347-368. 

WELLMAN, BETH L., & MCcCANDLESS, B. R. 
Factors associated with Binet IQ changes 
of preschool children. Psychol. Monogr., 
1946, 60(2, Whole No. 278). 

WERNER, H. Comparative psychology of mental 
development. (Rev. ed.) New York: Inter- 
national Univer. Press, 1957. 

WmrprLE, G. M. (Ed.) Nature and nurture. 
Yearb. Nat. Soc. Stud. Educ., 1928, 27, 


Parts 1 & 2. 


The theory of 
Psyche, 


152 


WniPPLE, G. M. (Ed.) Intelligence: Its na- 
ture and nurture. Yearb. Nat. Soc. Stud. 
Educ., 1940, 39, Parts 1 & 2. 

Wiener, N. Ex-prodigy: My childhood and 
youth. New York: Simon & Schuster, 1953. 

WirLiAMS, H. M. Some problems of sampling 
in vocabulary tests. J. exp. Educ., 1932, 1, 
131-133. 

WirLtAMS, RATH M., & Mattson, Marion L. 
The effect of social groupings upon the 
language of preschool children. Child De- 
velpm., 1942, 13, 233-248. 

WITTE, K. The education of Karl Witte. (Ed. 
by H. A. Bruce & trans. by L. Wiener) 
New York: Crowell, 1914. 

Wirrv, P. A. A study of one hundred gifted 
children. U. Kans. Bull. Educ., 1930, 1, 
No. 13. 

Witty, P. A. A genetic study of fifty gifted 


WILLIAM FOWLER 


children. Yearb. Nat. Soc. Stud. Educ., 
1940, 39, Part 2, 401-408. 

Wirrv, P. A. A modern interpretation of 
readiness for reading. Educ. Admin. Su- 
perv., 1946, 32, 257-270. 

WonrwiLL, J. F. Developmental studies of 
perception. Psychol. Bull., 1960, 57, 249- 
288. 

WoopwonrH, R. S. Heredity and environ- 
ment: A critical survey of recently pub- 
lished material on twins and foster children. 
Soc. Sci. Res. Council Bull., 1941, No. 47. 

Youwc, FLoRENE M. An analysis of certain 
variables in a developmental study of 
language. Genet. psychol. Monogr., 1941, 
23, 3-141. 


(Received December 17, 1960) 


Psychological Bulletin
1962, Vol. 59, No. 2, 153-160 


RADIATION RESEARCH IN PSYCHOLOGY: 
AN ANALYSIS OF TECHNIQUES IN MAZE EXPERIMENTATION 


SYLVAN J. KAPLAN 
Texas Technological College 


The past 10 years of work, involving 
the study of the effects of X radia- 
tion upon the learning and retention 
of the rat, has introduced many variables which are in need of careful
reexamination. The experimental findings have been fairly consistent in
showing that the unborn irradiated 
organism is more drastically affected 
in its learning ability than is the post- 
natally irradiated subject (Furcht- 
gott, Echols, & Openshaw, 1958; 
Kaplan & Harris, 1957; Kaplan, 
Tait, Wall, & Payne, 1951; Levinson, 
1952; Levinson, 1960; McCutchan, 
1957). Likewise, it has been shown 
that the learning ability of the ne- 
onate is more adversely affected by 
X radiation than is the adult rat 
(Kaplan, Tait, Wall, & Payne, 1951; 
Levinson & Zeigler, 1959; Tait, Wall, 
Balmuth, & Kaplan, 1952). 

Several learning instruments have been used to study these behavioral
effects of X ray. Among these, at least six different mazes (Arnold,
1952; Blair, 1958; Blair & Arnold, 1956; Fields, 1957; Fowler, 1957;
Furchtgott, 1951; Hayes, 1957; Levinson, 1952), operant conditioning
(Kaplan, 1960; Melching, 1957), and a variety of field study and
psychomotor tasks have been used (Furchtgott & Echols, [illegible]
1957). It is the purpose of this paper to isolate one of the methods of
study covered by various authors during the decade to point out the
weaknesses and capabilities of psychologists in answering some of the
questions in this field which are of vital concern to radiobiologists.




In the interest of posing the prob- 
lem, the data obtained from studies 
of irradiated rats, wherein the Lash- 
ley III maze has been used, will be 
subjected to a methodological anal- 
ysis. This maze, principally because 
it has been efficient in showing dif- 
ferential effects upon learning as re- 
lated to amount of cortical insult 
(Lashley, 1929), has been used by 
workers studying effects of radiation 
upon learning. In employing the 
Lashley III maze, it generally has 
been accepted that the hypothesis of 
deleterious effects of X radiation 
upon brain function would be sup- 
ported should learning deficits be 
demonstrated. 


THE SCOPE OF THE PROBLEM 


Radiation has been shown to have 
an adverse effect upon learning of 
the albino rat under certain condi- 
tions. It is clear that the interrela- 
tionships between age at time of 
radiation, age at time of testing, dose 
rate, and magnitude of dose are all important factors in the amount of
learning deficit found. It has also 
been clearly shown that there are 
marked individual differences in re- 
sistance to radiation, which implies 
that samples of sufficient size are 
necessary in order to make meaningful generalizations from data. In
order to study even a few of these factors, experimenters in the average
university laboratory find it extremely difficult to conduct their
experiments without confronting difficulties such as inadequate space,
insufficient storage facilities for ample numbers of experimental
subjects, and shortages of man power, time, and finances. For
example, in light of the importance 
of studying animals which have been 
radiated in utero and are not tested 
until 400 or 500 days of age, space 
must be available for storing the 
animals for that period of time. 
Again, to allow for adequate samples 
of subjects radiated with five doses at 
10 different pre- and postnatal pe- 
riods, one must employ very large 
populations of subjects. 
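
The arithmetic behind this requirement can be sketched as follows. The five doses and 10 radiation periods are taken from the text; the number of animals per cell and the optional unirradiated controls are illustrative assumptions, not figures reported by any of the studies reviewed.

```python
def required_subjects(n_doses, n_periods, per_cell, controls_per_period=0):
    """Animals needed for a fully crossed dose x age-at-radiation design."""
    irradiated = n_doses * n_periods * per_cell
    controls = n_periods * controls_per_period
    return irradiated + controls

# Five doses at 10 pre- and postnatal periods (from the text);
# 10 animals per cell is a hypothetical sample size.
total = required_subjects(n_doses=5, n_periods=10, per_cell=10)
print(total)  # 500
```

Even at this modest assumed cell size, the colony must hold several hundred animals, some of them for more than a year before testing.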

The answer to the problem has been attempted, in part, by independent
workers in various laboratories who have attacked different aspects of
the same problem. Levinson (1952), Levinson and Zeigler (1959),
Furchtgott, Echols, and Openshaw (1958), Tait et al. (1952), Kaplan et
al. (1951), Kaplan (1960), and Levinson (1960) all have explored
different aspects of effects of radiation on performance. Their work
has extended over a period from 1950 to 1960. As can be seen in Table 1,
many of the variables isolated as 


TABLE 1
PRE- AND POSTNATAL RADIATION STUDIES BETWEEN 1950 AND 1960

PRENATAL: Age at Time of Radiation (Days): 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21
[Cell entries, giving the N of each study at each day, are illegible.]

Note. The age of the N subjects at the beginning of study was coded by superscript: 30 days, 45 days, 90 days, or 400 days.

relevant have been investigated, and
it would seem that the combined 
efforts of these authors should have 
begun to provide many answers to the 
problems under question. In some 
ways, there is no doubt that this is 
so. There is consistency among the 
findings to a remarkable degree. Yet, 
in the interest of better scientific 
methodology, it is important to pre- 
sent the many limiting factors exist- 
ing in the accumulated data which 
make cross comparisons and general- 
izations difficult, if not impossible, 
without much additional experimen- 
tation. 


THE LITERATURE UNDER 
CONSIDERATION 


Prenatal Studies. From Table 1, it
is seen that several papers have ap- 
peared reporting that subjects have 
been X-radiated in utero. These 
studies have received much attention 
from scientists studying radiation 
effects upon the nervous system, par- 
ticularly in view of the histological 
analyses that call attention to the 
times throughout the gestation period 
when specific organs and organ sys- 
tems are peculiarly sensitive to radia- 
tion damage (Hicks, 1950; Hicks, 
1953; Rugh, 1960). 

These experiments have all tended to support the position that
radiation during the second week of gestation produces the most marked
deficits in learning, with the thirteenth day being one of the most
critical periods (Levinson, 1952; Tait et al., 1952). More recent
studies, however, have pointed to the fact that the sixth and seventh
days may be even more sensitive to radiation damage, in terms of
impaired learning, provided the subject is tested at a later age in
life than that usually studied by most experimenters (Kaplan, 1960).

Postnatal Studies. Rats have been radiated from the first day post
partum through 100 days of age and then tested for learning ability in
the Lashley III. In general, the Bergonié-Tribondeau law, which states,
in effect, that the younger the tissue the greater its sensitivity to
radiation insult (Ellinger, 1946), has been sustained by the
experimental findings. Levinson and Zeigler (1959) have found that rats
X-rayed on the first 2-4 days are less efficient learners than those
radiated at 24 days of age. And Kaplan and Harris (1957) have shown
that the 100-day-old radiated rat will not be impaired by a 100 r.-200
r. dose. Levinson and Zeigler (1959) and Furchtgott et al. (1958) have
used these dose levels on younger animals and obtained results showing
impairment of learning. Kaplan and Harris have shown it requires an
exceptionally large dose (600 r.) to impair the learning of the
100-day-old radiated rat.


PROCEDURAL DIFFERENCES 


The primary problem in analyzing 
these findings rests on the variation 
in procedures. Maze and training 
differences in the various experiments, 
criteria for evaluation of results, and 
radiation procedures all involve dif- 
ferences which may possibly affect 
the use of the data for purposes of 
comparing and amplifying work of 
individual authors. 

The Maze. Since Lashley has provided only a limited description of
his maze (Lashley, 1929), the various workers in this radio-behavioral
field have employed modifications of the original without recourse to a
standardized and universally agreed upon version of that instrument.
Furchtgott et al. (1958) described a maze having a linoleum floor,
painted grey, and a hardware cloth top. The mazes of Levinson (1952)
and Tait et al. (1952) differ in the length of the entrance and exit
alleys, and neither group of authors states whether their mazes had
floors and tops similar to those of Furchtgott. Kaplan (1960) used a
maze 12 inches in height, painted black; Levinson's maze was 34 inches
in height and was unpainted. Kaplan also employed retrace gates, while
the other workers (Furchtgott et al., 1958; Levinson, 1952; Levinson,
1960) did not make mention of this innovation.

The Straightaway for Preliminary 
Training. While Furchtgott used a 
3-foot alley runway (Furchtgott et 
al., 1952), Levinson employed the 
Lashley I maze (dimensions not 
given) for preliminary training (Levin- 
son, 1952). Kaplan et al. (1951) and 
Tait et al. (1952) used a 4-foot alley 
runway, and Kaplan (1960) in other 
studies used a 6-foot alley runway. 

Preliminary Training and Testing 
Procedures: Food Reward and Implied 
Differences in Motivation. Workers 
have differed in amount of food deprivation employed. Kaplan (1960)
reported a 23½-hour deprivation schedule. Furchtgott et al. (1958)
reported a 22-hour deprivation, while Levinson did not mention the
deprivation period in one report (1952) and mentioned a 23-hour
deprivation schedule in another (1960).

For food reward, mention has been 
made of bran pellets scattered on the 
floor (Furchtgott et al., 1958), three 
moistened Purina lab pellets (Kaplan, 
1960), and a moistened Purina mash 
made into a pellet (Levinson, 1960). 
Some workers allowed 30 seconds for 
the rat to eat at the end of the run- 
way (Kaplan et al., 1951; Tait et al., 
1952), several did not mention time 
allowed for eating (Levinson, 1952; 
Levinson, 1960), while others allowed 
10 seconds for eating at the end of the 

runway (Hicks, 1953). 
Variation in daily feeding for main- 
taining body weight and constant 
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motivation also has been reported. 
Furchtgott et al. (1958) fed their sub- 
jects 1 hour per day, 30-60 minutes 
after the run. Others have not men- 
tioned the feeding schedule (Levin- 
son, 1952; Levinson, 1960; Tait et al., 
1952). 

Preliminary Training. Some workers reported that all subjects were
handled for several days before straightaway training was begun
(Kaplan, 1960). Others did not mention
handling procedures (Furchtgott et 
al., 1958; Levinson, 1952). One re- 
port described preliminary adapta- 
tion as simply a 3-day period in a 
single unit straight alley (Levinson, 
1960). Two trials per day for 6 days 
have been employed as a training 
method by other authors (Kaplan, 
1960). Levinson (1952) reported 
training her subjects daily (number of 
trials per day not stated) until they 
had run three consecutive trials in 
20 seconds. Tait et al. (1952) allowed 
three trials per day for 3 days with 
no time criteria mentioned. Furcht- 
gott reported keeping each subject in 
the runway 10 minutes daily for 7 
days (Furchtgott et al., 1958). 

Hence, for preliminary training, it 
can be seen that degree of docility 
arising from differences in handling, 
and criterion of readiness both become 
factors contributing to an incompara- 
bility between the studies. Learning 
rates subsequent to this training un- 
doubtedly are a function, in part, of 
the levels of motivation attained by 
each rat performing under the re- 
quirements established for its re- 
spective training procedure. 

Training and Evaluation. The experiments under scrutiny present five
different procedures for testing the subjects in the Lashley III: (a)
in one experiment (Furchtgott et al., 1958) rats were given two trials
per day the first day, three trials per day the next 3 days, and five
trials per day for the last 2 days; (b) in another set of experiments
(Levinson, 1952) rats were given five trials per day to a criterion of
two out of three errorless trials in 10 seconds; (c) in the third group
of studies (Tait et al., 1952), subjects were given one trial per day,
6 days per week. In some of these, retrace gates were employed; in
others, they were not. The criterion of mastery for these studies was
four out of five errorless trials, time of running not considered. (d)
In a fourth group of studies (Kaplan, 1960), rats were given one trial
per day, 7 days per week to a criterion of four out of five errorless
trials. In this group of studies, running time was not considered in
the criterion of mastery.
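
How much the choice of criterion alone can shift a "trials to mastery" score may be seen in a minimal sketch. The function name, the error counts, and the decision to let the criterion be met before a full window of trials has been run are illustrative assumptions; the 10-second time limit attached to the two-out-of-three criterion is ignored here.

```python
def trials_to_criterion(errors_per_trial, needed=4, window=5):
    """Return the 1-based trial on which at least `needed` of the most
    recent `window` trials were errorless, or None if never reached."""
    for end in range(1, len(errors_per_trial) + 1):
        recent = errors_per_trial[max(0, end - window):end]
        if sum(1 for e in recent if e == 0) >= needed:
            return end
    return None

# Hypothetical error counts for one rat on successive trials.
errors = [3, 2, 0, 1, 0, 0, 0, 0]

four_of_five = trials_to_criterion(errors, needed=4, window=5)
two_of_three = trials_to_criterion(errors, needed=2, window=3)
print(four_of_five, two_of_three)
```

The same record of errors yields different mastery scores under the two criteria, which is one reason scores from different laboratories cannot be compared directly.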

Definitions of Errors. Some authors have scored nonretrace and retrace
errors involving full-body-length entry (Levinson, 1952). Others
(Kaplan, 1960) have included full-body-length and two
partial-body-length forward and retrace errors and also "hesitation
errors." Kaplan has utilized all of these errors in arriving at an
"errorless trial," which obviously represents a more stringent
criterion than one based only on a full-body-length error.

For the most part, all seem to be in agreement on measures of time
scores, i.e., from time of leaving the starting box to entry into the
goal box.


Radiation Differences. The differences between studies cited here
refer to differences involving comparative psychology methodology. An
equally serious variability in radiation procedures prevails (Table 2).

It has been shown that with monkeys, a rate of 15 r/min (Kaplan
[illegible] 1953) as compared with [illegible] r/min (Kaplan, Melching,
Reid, [illegible], & Johnson, 1960) has a markedly different effect
upon behavior. Unless relative biological

effects (RBE) can be equated, com- 
parisons of maze behavioral studies 
will be affected adversely (Glasser, 
Quimby, Taylor, & Weatherwax, 
1952; Pickering & Hekhuis, 1960). 
(For example, it has to be estab- 
lished that the differences in rate 
which range from 10 r/min to 116 
r/min produce comparable biological 
effects with dose held constant.) The 
radiobiologists and physicists have 
pointed out that there is a markedly 
different quality to the radiations 
produced by X-ray machines of 
highly different voltages¹ (Glasser et
al., 1952; Pickering & Hekhuis, 1960; 
Rugh, 1960). The target distances, 
reported in the experiments under 
consideration, vary between 50 and 
91 centimeters. Variations in distance may not be as significant as the
varying amperage of the machines (4 ma. to 30 ma.), but it is still
important to recognize that the ionizing capability of the 80 KVP X-ray
machine that Levinson (1952) employed to produce her behavioral
findings with rats given 300 r. may be different from that of the 170
KVP machine used by Furchtgott et al. (1958), or the 250 KVP machine
used by Tait et al. (1952). Both of the latter experiments gave
approximately the same total body dose but with different amperage,
rate, and target distance.


1 "Regarding the enclosed data [Table 2] a number of things must be
said.

1. The KVP variations will affect the quality of radiation, and
consequently its penetrability. For instance, a rough estimate would be
that at 80 KVP the penetration to a gravid rat uterus would be about 20
per cent less than with 250 KVP. This is a very rough estimate. The way
to check is with a phantom.

2. The amperage has nothing to do with the quality of radiation and if
a Victoreen dosimeter were used to determine roentgens, at a specific
point, this would not be appreciably altered by altering the amperage.

3. In several instances no mention is made of filtration, either added
filtration or inherent filtration of the x-ray tube. This is a serious
omission. If no filtration is used at least that of the tube should be
mentioned.

4. There is a trend now which will soon become insistent that radiation
be [expressed] in rads, not roentgens. This refers to absorbed dose and
not air dose, and is generally somewhat less because of attenuation.
Roentgens are air dose measurements, generally, but in biology we are
interested in absorbed dose.

Thus, some of these data are useless because they are not reproducible.
[The data being insufficient.] Further, the data are not comparable
when the KVP range is from 80 to 250. The data might be more comparable
if single cells were involved, in a mono-layer of cells, but certainly
not in a rat." (R. Rugh, personal communication, October 26, 1960.)


TABLE 2
DOSIMETRY DIFFERENCES BETWEEN EXPERIMENTS

Columns: Reference, KVP, Amperes, Filtration, Target distance, Rate.

References tabulated: Furchtgott, Echols, & Openshaw, 1958; Kaplan,
1960; Kaplan, Tait, Wall, & Payne, 1951; Levinson, 1952; Levinson &
Zeigler, 1959; Tait, Wall, Balmuth, & Kaplan, 1952.

Legible entries, in printed order (the KVP, amperes, and filtration
values, and the alignment of entries with references, are illegible):
41/min: 200, 300
[?]0/min: 50, 100, 150, 200 (upper and lower tubes)
25/min: 25 (upper tube only)
10/min: 10 (upper tube only)
30/min: 0, 90, 180, 360, 600
116/min: 0, 300, 375, 450, 600
6.0/min: 0 to 150
6.2/min: 350
30/min


Some workers report how they compute target distances; others do not.
Some report the filtration included in the radiation source; others do
not. Some employ phantoms for determining relative biological dosage
given; some measure dosage in air. Some report deviation of
measurement; others do not.


DISCUSSION


The implications of this analysis 
cannot be ignored. In spite of the 
fact that each study has employed 
normal control subjects and has pro- 
vided needed information on the 
problems investigated, much valu- 
able cross referencing has been lim- 
ited because of the differences in 
procedures employed. With so many 
questions yet unanswered, it would 
seem valuable to the profession if 
those working in the field could agree 
to standardize some of these experi- 
mental conditions while permitting 
latitude in the study of the particu- 
lar problems not contingent upon 
this standardization. 

If the Lashley III maze is to be 
used for radiation study, it is sug- 
gested that its characteristics be as 
constant as possible. Likewise, pro- 
vided they do not interfere with the 
goals of the study, it is suggested 
that the procedures for preliminary 
training and testing be standardized 
as to handling, food reward, and 
measures of learning. The same 
recommendation is submitted with regard to the employment of
X-radiation sources so that the comparability problem may be controlled
as far as is practicable. Such a standardization would certainly lead
to the acquisition of data which would establish greater validity for
the Lashley III maze as an instrument of measurement of radiation
effects.

Similar standardization should be 
established for all other instruments 
used in radiation studies. Perhaps 
second to the Lashley III maze, the 
Tolman-Honzik or Stone multiple T 
maze is the most extensively em- 
ployed (Arnold, 1952; Blair, 1959; 
Blair & Arnold, 1956; Fowler, 1957; 
Harris, 1956; Harris & Kaplan, 
1957). Experimental studies using 
this maze should be examined for 
possible adjustment of standardiz- 
able factors. The same arguments 
apply to the procedures under study 
by workers using operant equipment. 
In the operant field, perhaps, it is 
premature to make such an effort, 
since the operational techniques are 
still in process of being established. 
Still, since several workers have been 
using this technique in studying 
learning after radiation (Kaplan, 
1960; Melching, 1957) their atten- 
tion is called to the necessity for 
standardizing apparatus and pro- 
cedures here. 
The comparative-physiological trend in the study of the functions of
the organism arose in the nineteenth century and developed under the
influence of evolutionary views on the origin of animals and man.
Comparative physiology of the higher nervous activity is based on the
reflex theory of I. M. Sechenov and I. P. Pavlov, as well as on the
evolutionary theory of Charles Darwin. The present article describes
the most important results of the comparative-physiological study of
higher nervous activity carried out by our laboratory; it also gives an
interpretation of these results.²

1 Reprints of this article may be obtained from the Editor of the
Psychological Bulletin.

2 I have utilized here part of the experimental material accumulated by
my colleagues as well as by myself at the Pavlov Institute of
Physiology of the USSR Academy of Sciences (Leningrad) in 1950-54, at
the Institute of Higher Nervous Activity and Neurophysiology of the
USSR Academy of Sciences in 1957-60, and at the Moscow Lomonosov State
University in 1954-60.


I believe that not only the very [illegible] of our method of
investigation will be of interest to the reader, but also the
theoretical propositions which underlie the fundamental idea of our
research into both common and different features of higher nervous
activity in various species of vertebrates. These propositions were
formulated in the laboratories of Sechenov and Pavlov and may be
reduced in brief to the following:

1. The work of the nervous system, 
including its higher division—the 
brain—is effected reflexly. It can be 
made fully known through the devel- 
opment of the reflex theory based on 
the principle of determinism (accord- 
ing to which there are no causeless 
reactions in the organism), the struc- 
tural principle (there are no func- 
tions in the organism which are not 
adjusted to the structure), and the 
principle of analysis and synthesis 
(the nervous system is capable of dif- 
ferentiating and integrating all the 
influences to which it is subjected, as 
well as the activity of the organism 
arising in response to these influ- 
ences). 

2. The method of confronting ex- 
ternal influences falling on the organ- 
ism with its corresponding response 
makes it possible to understand the 
nervous mechanism involved in this 
response. 

3. The conditioned reflex is a uni- 
versal mechanism of activity ac- 
quired in the course of the organism's 
individual life. The researcher can 
investigate the laws of development 
of this reflex (beginning with its most 
elementary form) and of its gradual 
complication in onto- and phylogenesis.
4. In the course of evolution of the
animal world there took place only a 
quantitative growth or complication 
of higher nervous activity. The 
latter acquired qualitative distinc- 
tions as a result of the transition of 
the organism to the specifically hu- 
man mode of interaction with the 
surrounding medium. 

All these propositions undoubtedly 
require further development, espe- 
cially in connection with the progress 
of the analytical methods of modern 
physiological investigations and with 
the bright prospects which they open 
before researchers. 

Comparative physiological investi- 
gations of higher nervous activity 
conducted by our laboratory are by 
no means unique in the Soviet Union. 
Considerable and fruitful work in 
this field is done by the laboratory of 
D. A. Biryukov (1959, 1960) at the 
Leningrad Institute of Experimental 
Medicine, by A. D. Slonim³ and
M. E. Lobashev at the Pavlov 
Physiological Institute, by the lab- 
oratory of A. B. Kogan (1959) at the 
Rostov University, and by numerous 
laboratories of other scientific estab- 
lishments. 


SUBJECTS AND METHODS 
OF INVESTIGATION 

The subjects of our investigations 
are: fishes, tortoises, pigeons, rooks, 
hens, ducks, rabbits, rats, dogs, 
macaques (Macacus rhesus), green 
monkeys (Cercopithecus aethiops), baboons
(Papio hamadryas), capuchins
(Cebus apella), chimpanzees (Pan 
shimpanse), and human beings. 

In choosing this series of subjects 
we proceeded on the basis of the 
gradual progressive complication of 
their brain structure. The degree to 
which the given animal could be in- 
vestigated in laboratory conditions 
was also of importance. 


3 See the monograph of K. M. Bykov and 
A. D. Slonim (1960). 
In natural conditions all the above-mentioned animals procure their
food with the help of their extremities, teeth, beaks, and jaws; they
dig up the ground, turn over stones, break the branches of trees, etc.

According to comparative morphological investigations carried out by
A. N. Severtsev, characteristic of all the vertebrates is an oral
cavity apparatus of the grasping type. "This apparatus proved to be an
extremely plastic organ, i.e., an organ capable of adapting itself to
highly diverse food variations" (Severtsev, 1934, p. 29). However,
already in
lower vertebrates (amphibians and 
reptiles) the fore extremities serve as 
auxiliary organs for grasping food, 
while in higher animals (for example, 
in monkeys) they turn into the most 
important apparatus of food procure- 
ment. 

In view of this great importance of 
the motor function of the oral cavity 
apparatus and of the extremities, we 
have applied the method of food- 
seeking conditioned reflexes (Baru, 
1953a; Chernomordikov, 1953;
Malinovsky, 1952a; Prazdnikova,
1953a; Voronin, 1954a). According 
to this method, a definite accidental 
or experimentally provoked move- 
ment performed by the animal, such 
as pressing a button or a pedal with 
one of the extremities or with the 
beak, catching a ring or a suspended 
bead with the jaws, was accompanied 
by the act of feeding.

After the animal performed a corresponding movement at the sight of a
certain manipulator (a pedal, a ring, a bead, etc.), food was
administered only if the animal's action coincided with a definite
external signal (auditory or photic). Thus, in response to the action
of a light signal, for example, the experimental fish caught a bead
with its jaws; the turtle seized a ring; the bird began to peck
[illegible]; the dog pressed a pedal with its forepaw; the monkey
pressed a lever, a button, or some other manipulandum.

The action of the stimulus and the [illegible] motor reflex were
recorded on a moving band of the kymograph.

A similar motor conditioned reflex, for example, the reflex of pressing
a button or a lever, was established also in man. In this case,
however, it was reinforced not by food, but by the action of a signal
stipulated by a preliminary instruction; the subject was told that if
he performed the right movement, this would be confirmed, for example,
by the emergence of a red light (Rokotova, 1954b).

Food-seeking conditioned reflexes were established not only to single
sound or light stimuli, but also to complex stimuli consisting of
simultaneously or successively acting sound and light agents.

Finally, with the view of studying the laws of analysis and synthesis
of proprioceptive stimulations arising in the course of motor
reactions, we established chains of motor conditioned reflexes, i.e.,
series of consecutive movements in response to chains of successive
sound and light stimuli.

The formation of chains of motor conditioned reflexes was effected in
two ways: either as a result of a consecutive combination of already
established homogeneous or heterogeneous food-seeking movements
(Rokotova, 1954; Voronin, 1947), or by the consecutive addition of
accidental or experimentally provoked movements to previously formed
food-seeking reactions (Voronin & Napalkov, 1959, 1960).
In the first case, for example, the
experimental rabbit or dog reacted
to the light of an electric bulb by
catching a ring with the teeth; at the
sound of a bell it tapped a pedal with
its forepaw; at the sound of a metro-
nome it made a movement with the 
jaws aimed at catching another ring. 
When all the conditioned agents were 
combined into a chain of stimuli, the 
animal reacted to it with a cor- 
responding series of movements, 
after which it was given food. 

In the second case, the light of an 
electric bulb was switched on after 
the accomplishment of a previously 
established reaction, for example 
after tapping a pedal with the fore 
extremity in response to the sound of 
a bell; if the animal performed an- 
other movement which corresponded 
to the scheme of the experiment, this 
was reinforced by the administration 
of food. In this way a chain of two 
movements was formed. The same 
principle was applied in the establish- 
ment of a chain reaction consisting of 
three, four and more movements. A 
conditioned inhibitor may be formed 
to such a chain of movements by 
means of a systematic nonreinforce- 
ment of the chain reaction if the 
latter takes place in the presence 
of a certain (conditioned inhibitory) 
agent. Then it is possible to establish 
a chain of two movements which are 
provoked by external stimuli and 
which eliminate the conditioned in- 
hibitory agent; after that the positive 
conditioned signals and the food- 
seeking movements become effec-
tive.

The same principle was applied in 
the establishment of still more com- 
plex chain reactions consisting of ali- 
mentary and defensive movements. 
For example, an alimentary chain of 
movements consisting of three com- 
ponents has been established in a 
dog; but the animal does not perform 
this series of movements in the 
presence of an acting signal which is 
followed by a harmful stimulus, for 
example, by a blow on the back with 
a wire clamp. By switching off the 
signal of the harmful stimulus and by
utilizing this as a reinforcement, we
can establish a new chain of move- 
ments; the animal performs a con- 
secutive series of movements which 
lead to the elimination of the signal of 
the harmful stimulus and, con- 
sequently, of the stimulus itself. 
After that the unimpeded perform- 
ance of a chain of food-seeking move- 
ments by the animal leads to the 
obtainment of food. 

In a human subject we establish 
in these cases not food-seeking reac- 
tions, but movements aimed at solv- 
ing certain tasks stipulated by 
preliminary instructions. Thus, the 
subject was asked to switch on a def- 
inite signal (for example, a red 
electric bulb) by means of perform- 
ing any suitable movements on the 
switch panel. It is this bulb that 
served as a reinforcement, as a signal 
for the correctness of the subject's 
movements. 

If the subject performed a series of 
movements which corresponded to 
the established scheme of the experi- 
ment, these movements were rein- 
forced. Besides, conditioned inhibi- 
tors of three kinds were established to 
the chain of movements; a successive 
elimination of each of these inhibi- 
tors with the help of two or three 
movements made it possible to solve 
the task set by the preliminary in- 
struction. 


ELEMENTARY FOOD-SEEKING
REACTIONS


Numerous investigations carried 
out in Pavlov's (1926) laboratories 
showed that the formation of positive 
conditioned reflexes (reinforced by 
food or by some other unconditioned 
stimulus) and of inhibitory or nega- 
tive (unreinforced) reflexes was based 
on the interaction of the processes of 
excitation and inhibition in the brain. 
Both the excitatory and inhibitory 
processes possess definite strength 
and mobility, and are equilibrated.
By establishing the rate of formation
of positive conditioned reflexes we 
can determine the strength of the 
process of excitation and its equilib- 
rium with the process of inhibition. 
The rate of formation of a negative 
reflex, i.e., of any kind of internal in- 
hibition (extinction, differentiation, 
conditioned inhibition, retardation) 
may serve as an index of the strength 
of inhibition and of its equilibrium 
with the process of excitation. 

In this connection, a series of in- 
vestigations of elementary food-seek- 
ing conditioned reflexes in vertebrates 
of different levels of phylogenesis 
was undertaken in our laboratory.
The results of numerous researches
carried out by our coworkers and
published in various journals, as
well as in special articles written by
myself (Voronin, 1954a, 1954b, 1955)
show that the rate of formation of ac- 
tive conditioned reflexes in verte- 
brates is approximately the same, and 
consequently cannot reflect the level 
of phylogenesis. The data presented 
in Table 1 corroborate the point of 
view of D. A. Biryukov (which was 
expressed several years prior to us) 
that the rate of formation of a motor 
reflex, such as the movement of the 
experimental animal towards the 
place of feeding at a definite signal, 
is not a reliable criterion for ascer-
taining the specific features of higher
nervous activity of different levels of
development.


TABLE 1
RATE OF FORMATION OF A
CONDITIONED REFLEX

Animals                          Number of animals
Goldfish                         29
Hen                              15
Rabbit                           36
Dog                              21
Green monkey, baboon, macaque    14
Chimpanzee                       5

[The column giving the number of applications of the stimulus after which the reflex was formed is illegible in the source.]

Note.—This table, as well as all others, shows the limits
within which the rate of formation of a conditioned reflex
varies in each animal.


TABLE 2
RATE OF FORMATION OF DIFFERENT
KINDS OF INTERNAL INHIBITION

                              Number of applications of the stimulus after
                              which internal inhibition was formed
Animals                       Extinction   Differentiation   Conditioned inhibition
Goldfish                      28-78        20-40             15-20
Steppe and Greek tortoises    6-15         12-40             7-143
Jackdaw                       —            20-40             6-8
Hen                           11-84        10-30             5-15
Rabbit                        10-30        6-13              12-26
Dog                           10-18        4-18              —
Baboon, macaque               7-15         3-25              8-12
Chimpanzee                    8-10         9-30              3-6

Note.—The double figures indicate the limits within
which the rate of development of internal inhibition varies.
The numbers of animals are the same as in Table 1.


A comparison of the rates of forma-
tion of negative conditioned reflexes 
shows that in this case a certain 
difference is in evidence (Table 2). 

Thus, the strength of the excita- 
tory process in vertebrates in the 
case of an elementary food-seeking 
conditioned reflex is equal. As to the
strength of internal inhibition, it
somewhat increases in the ascending
line of vertebrates, as shown by ex-
periments performed in similar condi-
tions.

In view of this, one might draw the
conclusion that in the course of
evolution the equilibration of
excitation and inhibition is somewhat
perfected. However, this conclusion
is disproved by the fact that the
process of excitation in monkeys
considerably predominates over the
process of inhibition, owing to which
these animals are highly excitable
and unequilibrated. Thus, we can
speak only of a certain tendency to- 
wards an increase in the strength of 
inhibition and equilibration of the 
nervous processes which in a number 
of cases may not overstep the limits 
of individual variations in animals of 
any species.

In order to reveal certain distinc- 
tive features in the higher nervous 
activity of different species of ani- 
mals, we also investigated the mobil- 
ity of the nervous processes, the 
capacity of each of these processes to 
turn into its opposite, i.e., the capac- 
ity of excitation to turn into inhibi- 
tion and vice versa. 

We determined the mobility of the 
nervous processes by way of chang- 
ing the signaling properties of the 
positive and negative conditioned re- 
flexes. For this purpose, after the 
stabilization of the reflexes the se- 
quence of their reinforcement was 
changed; the positive reflex was not 
reinforced at all, while the negative 
reflex was. Owing to this, after the 
lapse of a certain time the signaling 
properties of the stimuli changed into 
their opposites, and consequently, 
the conditioned reflexes themselves 
were, in our laboratory terminology, 
"reversed." In order to determine 
the capacity of the animals to endure 
an overstrain of the mobility of the 
nervous processes, and to establish its 
susceptibility to training, the condi- 
tioned reflexes were consecutively 
reversed several times in each species 
of animals.
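
The reversal procedure described above can be sketched as a simple reinforcement schedule. This is only an illustrative sketch under stated assumptions, not the laboratory's actual protocol: the stimulus names ("bell", "light") and the function `reversal_schedule` are hypothetical and do not come from the experiments themselves.

```python
def reversal_schedule(stimuli, n_reversals):
    """Return the reinforcement rule for each phase of a reversal experiment.

    stimuli: pair (initially_positive, initially_negative); the names
    are hypothetical placeholders.
    Phase 0 is the original training; every later phase swaps which
    stimulus is followed by food.
    """
    positive, negative = stimuli
    phases = []
    for _ in range(n_reversals + 1):
        phases.append({positive: True, negative: False})
        # Reversal: the formerly reinforced stimulus is no longer
        # reinforced, and the formerly unreinforced one now is.
        positive, negative = negative, positive
    return phases


schedule = reversal_schedule(("bell", "light"), n_reversals=2)
for number, rule in enumerate(schedule):
    print(number, rule)
```

In such a scheme the quantity reported for each species would be the number of stimulations needed in each phase before the animal's responses match the current rule.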

Table 3 shows that in most ani- 
mals, judging by the first "reversal" 
of the reflexes, the mobility of the 
nervous processes is almost equal. 
Only the monkeys, and particularly 
the chimpanzees, constituted an ex- 
ception: in this case the reversal was 
accomplished very rapidly. 

TABLE 3
NUMBER OF STIMULATIONS NECESSARY TO "REVERSE" CONDITIONED REFLEXES

Reversal   Fish      Tortoises   Jackdaws   Rabbits   Dogs     Baboons   Chimpanzees
1          30        25-30       41-120     47-107    33-306   9-58      4-6
2          151       21-124      26-0900    17-32     2-11     2-17      34
3          failure   22-25       21-70      31-90     427      3342      24
4          —         23-25       21-70      25        2-24     2         1
5          —         failure     38         13        3-19     2-3       1
6          —         —           —          11        1-2      1-2       1
7          —         —           —          49        1-2      1-2       1
8          —         —           2          —         24       2         —
9          —         —           —          —         2        1         —

Note.—The sign (—) used in this and other tables means that no test was performed.


However, further changes in the
signaling significance of the stimuli
showed that fish and tortoises differ
greatly from other animals: the re-
flexes proved to be reversed only in
one crucian out of the three which 
were subjected to experimentation; 
in the two others no reversal took 
place even for the first time, be- 
cause after 178-200 stimulations the 
conditioned reflexes disappeared. In 
the case of tortoises only two (out of 
five) exhibited a reversal of the re-
flexes; in the other tortoises after 70-
100 stimulations the conditioned re-
flexes began to manifest themselves
very irregularly and, finally, fully 
disappeared. 

Thus, our investigations of the 
mobility of the nervous processes in a 
series of vertebrates have revealed 
marked distinctions in this highly 
important property of the nervous 
processes on different phylogenetic 
levels. 

On the basis of these facts we con- 
sidered it possible to draw the follow- 
ing conclusion (Voronin, 1953): 
the more developed the nervous system of the 
animal and the greater the mobility of the 
processes of excitation and inhibition, the 
more rapidly can the animal be trained, the 
more resistant is the nervous system to the 
frequent and sharp changes of external in- 
fluences (p. 53). 


Our experiments have also shown 
that whereas in the case of higher 
animals the process of reversing the
signaling properties of the stimuli of
food-seeking reflexes (with which we 
were concerned in our experiments) 
presents a very elementary task, for 
lower vertebrates (crucians, tortoises) 
it is, apparently, a rather complex 
procedure which approximates the 
limit of their capacity to “learn” in 
the given conditions. 

This conclusion is particularly cor- 
roborated by repeated reversals of 
the signaling properties of stimuli, 
i.e., in conditions of training, or to
be more precise, in conditions in-
volving an overstrain of the lability
of the nervous processes.

In this connection we specially in- 
vestigated the process of training 
extinctive inhibition in fish, tortoises, 
hens, and rabbits (Polivannaya, 
1960); in dogs and rabbits (Bez- 
nosikov, 1954); and in baboons 
(Voronin, 1951). 

The investigations showed that if
the extinction of the conditioned re-
flex is performed day by day, a smaller
and smaller number of repeated stim-
ulations unreinforced by food is re-
quired to extinguish the conditioned
reflex. Finally (after a certain num-
ber of experimental days), it is
sufficient to apply the unreinforced
stimulus only once, and no subse-
quent application will evoke the con-
ditioned reflex reaction.


TABLE 4
TRAINING OF EXTINCTIVE INHIBITION

Animals     Number of experiments required for training
Crucian     68
Tortoise    50
Hen         36-54
Rabbit      12-38
Dog         7-8
Baboon      10-12


As shown in Table 4, in dogs and 
baboons this phenomenon manifests 
itself after a period of 7-12 days, 
while in lower vertebrates (crucians, 
tortoises) a considerably greater 
number of experimental days is re- 
quired. 
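
The comparison drawn from Table 4 can be checked with a short calculation. The sketch below simply copies the ranges printed in Table 4; ranking the animals by the midpoint of each range is an assumption made here for illustration, not a measure used in the original study.

```python
# Number of experiments required for training extinctive inhibition,
# copied from Table 4 as (minimum, maximum). Single values are
# entered as a range of zero width.
TABLE_4 = {
    "Crucian": (68, 68),
    "Tortoise": (50, 50),
    "Hen": (36, 54),
    "Rabbit": (12, 38),
    "Dog": (7, 8),
    "Baboon": (10, 12),
}


def midpoint(bounds):
    """Midpoint of a (minimum, maximum) range; an illustrative summary only."""
    low, high = bounds
    return (low + high) / 2


# Animals ordered from the fastest to the slowest training.
ranked = sorted(TABLE_4, key=lambda animal: midpoint(TABLE_4[animal]))
print(ranked)

# The "7-12 days" quoted for dogs and baboons is the span of their two ranges.
fastest = [TABLE_4["Dog"], TABLE_4["Baboon"]]
print(min(low for low, _ in fastest), max(high for _, high in fastest))
```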

When the rate of training extinc- 
tive inhibition was established, we 
had to find out whether this process 
exerted influence on the extinction 
of reflexes from other analyzers. For 
this purpose, we compared the 
dynamics of extinction of reflexes to 
sound and light stimuli before and
after training extinctive inhibition 
of a reflex to one of the stimuli. 

The results of some of these ex- 
periments (Beznosikov, 1954; Poli- 
vannaya, 1960) are given in Table 5, 
which shows that the training of
extinctive inhibition of one reflex
exerts a positive influence also on the
process of extinction of another re-
flex in animals at different levels of
phylogenesis. For example, in the
case of Crucian Number 5 (Table 5)
the first extinction of a conditioned
reflex required 37 applications of a
green light, 21 applications of a red
light, and 78 applications of a bell.
After the reflex was extinguished
chronologically, day by day, the
phenomenon of extinction began to
manifest itself right after the first
nonreinforcement. This training of
the process of extinction of a condi-
tioned reflex to a green light greatly
hastened also the process of extinc-
tion of reflexes to other stimuli.


TABLE 5
EFFECT OF TRAINING EXTINCTIVE INHIBITION OF ONE REFLEX ON THE RATE OF EXTINCTION OF OTHER REFLEXES

[Table body illegible in the source. Recoverable headings: "The stimulus tested" (including gurgle and siren); the entries give the number of applications of the stimulus.]

The influence exerted by the train- 
ing of extinctive inhibition proves to 
be strongest in the extinction of a re-
flex elaborated from one and the same
analyzer. 

Up to the present time the mech- 
anism involved in the training of ex- 
tinctive inhibition is not quite clear 
to us. We believe that the matter is
reduced to the formation of a condi- 
tioned reflex to a certain attribute of 
the experimental procedure which 
signals the absence of a reinforce- 
ment. It is the first application of a 
conditioned reflex without any sub- 
sequent reinforcement that becomes 
such a signal. After repeated applica- 
tions of the stimulus without rein- 
forcement this turns into an attribute 
signaling the absence of food. It is 
probable that the role of such a signal 
is played by the segment of time after 
which the subsequent conditioned re- 
flex begins to act; for example, in our 
scheme of extinction of a conditioned 
reflex this took place in 30 seconds. 
Consequently, the time factor may 
also become a conditioned signal. 

As is known, the extinction of a 
reflex is more rapid the greater the 
lability of the nervous processes. 
Owing to this, experiments with 
training extinctive inhibition corro- 
borate the results of our comparative- 
physiological investigations of the 
lability of nervous processes and our 
conclusion that the lability of ex- 
citation and inhibition perfected it- 
self in the course of evolution. More-
over, they allow us to draw the same
conclusion concerning the strength of 
the inhibitory process as well as a 
similar hypothetical conclusion con- 
cerning the excitatory process. 
Besides data relating to the perfec- 
tion of the excitatory and inhibitory 
processes in the course of evolution,
we were able to accumulate some
material concerning one of the most 
important adaptive properties of the 
nervous system—the stability of the 
stimulations that are imprinted in it, 
i.e., concerning the persistence of the 
traces of stimulations. 

This property underlies both the 
coupling functions of the higher divi- 
sions of the central nervous system, 
i.e., the formation of temporary con- 
nections of conditioned reflexes, and 
the preservation of these reflexes, 
i.e., memory.

The question of the persistence of 
traces of stimulations first arose for 
us when we studied the phenomenon
of conditioned inhibition in animals
at different levels of phylogenesis
(Chernomordikov, 1953; Firsov,
1953; Malinovsky, 1953a; Prazdni-
kova, 1953b). It was established
that if a conditioned inhibitory agent
begins to act prior to a positive
stimulus and if the interval between
them is equal to 5-10 seconds, no 
conditioned inhibition is formed in 
lower vertebrates (fish and tortoises). 
We assume that the faculty of the 
nervous system of these animals to 
preserve traces of stimulation is very 
little developed, so that 5-10 seconds
after the exclusion of the additional 
agent its traces cannot acquire a con- 
ditioned-inhibitory signaling prop- 
erty. In order to verify this assump- 
tion, we carried out a comparative-
physiological study of a conditioned
reflex to time; as is known, this reflex
is assigned to the group of trace con-
ditioned reflexes, i.e., temporary
connections formed to the traces of
stimulation. To establish this re-
flex, we fed the animal at definite in-
tervals, for example, every 2 minutes,
and all the accidental or experi-
mentally induced movements which
coincided with the act of eating be-
came conditioned.

Table 6 shows that the rate of for-
mation of a conditioned reflex to
time, with an interval of 1 or 2
minutes, is almost the same in all of
our experimental vertebrates. How-
ever, the stabilization of the condi-
tioned reflex was dependent on the
phylogenetic level of development of
the nervous system (Baru, 1953;
Bolotina, 1953; Chernomordikov,
1953; Prazdnikova, 1953). Thus, in
crucians and tortoises the condi-
tioned reflex did not become stabi-
lized at all; it remained unstable and
was masked by intersignal reactions;
in a number of cases it was absent in
spite of 200 combinations of the
animal's movement with feeding at
intervals of 1 or 2 minutes. With
great difficulty, i.e., only after 100-
250 combinations, was the condi-
tioned reflex to time stabilized in 50%
of all the experimental rabbits and
hens.


TABLE 6
RATE OF FORMATION AND STABILIZATION OF A
CONDITIONED REFLEX TO TIME

                              Number of       Number of combinations (maximum and
                              experimental    minimum) required for:
Animals                       animals         First manifestation   Stabilization
                                              of the conditioned    of the conditioned
                                              reflex                reflex
Crucian                                       9-15                  failure
Tortoise                      8               24-40                 failure
Hen                           12              7-45                  132-
Rabbit                        8               4-50                  100-150
Dog                           5               9-15                  60-120
Monkey (macaques, baboons)    14              5-13                  90-120


The conditioned reflex was most 
rapidly stabilized in dogs and mon-
keys. Besides, in these animals, in
comparison with others, the reflex was
[...] susceptible to external inhibi-
tion and its extinction was more
pro[...]. Finally, the duration of the
time interval to which the condi-
tioned reflex could be formed best
proved to be different in various ani-
mals. For example, in fish, tortoises,
hens, and rabbits the most distinct re-
flexes were formed to intervals of 1-2
minutes; in monkeys, to inter-
vals of 2-5 minutes; and in dogs, to
intervals of 4-6 minutes. In monkeys 
and dogs a conditioned reflex could 
be established to an interval of 11 
minutes, while in our other experi- 
mental animals the number of ade- 
quate reactions in the course of the 
experiment already showed a decline 
at an interval of more than 2-5 
minutes, and subsequently fully dis- 
appeared. 

Thus, we observed a direct de- 
pendence of the rate of stabilization 
of a conditioned reflex to time, as well 
as the degree of its distinctness and 
the conditions of its formation, on the 
phylogenetic level of development 
of the nervous system; this allows us 
to draw the following conclusion: 
the more developed the nervous sys- 
tem, the longer it retains traces of 
stimulation. 

This point of view is corroborated, 
along with the above mentioned 
facts, by the results of accidental 
and other observations in our labora- 
tory on the persistence of conditioned 
reflexes after certain intervals in 
their training. 

For example, we observed a com- 
plete recovery of a complex system of 
motor conditioned reflexes in a rhesus 
macaque after an interval of 8 years 
in experimentation (Voronin & 
Shirkova, 1949). 

The persistence of food-seeking 
conditioned reflexes was observed in 
hens after an interval of 3 months; 
fish manifested a deranged differenti- 
ation of inhibition after an interval of 
3-5 days (Prazdnikova, 1953a). 

Special experiments performed 
on crucians, pigeons, and rabbits 
(Chumak, 1958) showed that the 
higher the phylogenetic level of de- 
velopment of the nervous system, the 
less influence is exerted by an in- 
terval in experimentation on the 
preservation of a food-seeking con- 
ditioned reflex. 
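
The trend stated here can be illustrated with the recovery figures printed in Table 7. The sketch below rests on an assumption: that the pairs of values in Table 7 follow the order in which the animals are listed (three crucians, three pigeons, two rabbits). Averaging per species is likewise only an illustrative summary and is not taken from Chumak (1958).

```python
from statistics import mean

# (species, combinations needed for recovery after a 10-day interval,
#  combinations needed after a 20-day interval), as printed in Table 7.
# The assignment of value pairs to animals is an assumption (list order).
# None marks a cell printed as "no test performed".
TABLE_7 = [
    ("Crucian", 20, 71), ("Crucian", 35, 84), ("Crucian", 43, 74),
    ("Pigeon", 5, 22), ("Pigeon", None, 23), ("Pigeon", 7, 23),
    ("Rabbit", 0, 5), ("Rabbit", 0, 5),
]


def mean_recovery(species, column):
    """Mean number of combinations for one species; column 1 = 10 days, 2 = 20 days."""
    values = [row[column] for row in TABLE_7
              if row[0] == species and row[column] is not None]
    return mean(values)


for species in ("Crucian", "Pigeon", "Rabbit"):
    print(species, mean_recovery(species, 1), mean_recovery(species, 2))
```

Under these assumptions the mean recovery cost falls from crucians to pigeons to rabbits at both intervals, which is the ordering the paragraph describes.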

Thus, on the basis of the experi-
mental material we can state that the
food-seeking elementary conditioned 
reactions, with which we dealt in our 
experiments, constitute a universal 
phenomenon in vertebrates, a phe- 
nomenon that is formed in the in- 
dividual life of the animals according 
to the laws of the conditioned reflex. 
This phenomenon is so universal and 
at the same time so elementary that 
in this respect it is impossible to es- 
tablish any distinctions with respect 
to different levels of phylogenesis of 
the higher nervous activity of verte- 
brates. 

Apparently, the properties of the 
excitatory and inhibitory nervous 
processes on all the phylogenetical 
levels that were investigated by us 
are so developed that they insure the 
accomplishment of these elementary 
food-seeking reactions, reactions
which adapt the organism to its
usual conditions of existence. But, 
as shown by special experiments, 
any sharp change in these conditions 
exerts different influences on the 
conditioned reflex activity of various 
animals. For example, overstrain of 
the strength and lability of nervous 
processes makes it possible to reveal 
marked phylogenetical distinctions. 

The fact is that the more these
properties of nervous processes are 
developed, the more developed is the 
higher nervous activity of the ani- 
mals. This suggests the idea that it 
is the perfection of the properties of 
the excitatory and inhibitory nervous 
processes that lies at the base of the 
evolutionary perfection of higher 
nervous activity. In its turn, the 
development of the properties of 
nervous processes underlies the per- 
fection of the general mechanisms of 
higher nervous activity. We en-
deavored to disclose this interrelation
by means of a comparative-physio- 
logical study of different kinds of 
conditioned reflexes. Thus, we in- 
vestigated conditioned reflexes to 
time (Bolotina, 1952a, 1952b, 1953; 
Voronin, 1951; & others), temporary 
connections between indifferent stim- 
uli (Malinovsky, 1953b; Rokotova, 
1952, 1953, 1954a), conditioned re- 
flexes of second order (Brau, 1953b; 
Malinovsky, 1952b; Prazdnikova, 
1953), imitative conditioned reflexes 
(Bogomolova, Saakyan, & Kozoro- 
vitsky, 1956; Kozorovitsky, 1956; 
Voronin, 1947b), and conditioned re- 
flexes to relative attributes of stimuli 
(Chumak, 1957a, 1957b). 

TABLE 7
EFFECT OF INTERVALS IN EXPERIMENTATION ON THE PRESERVATION
OF FOOD-SEEKING CONDITIONED REFLEXES

                   Number of combinations required for recovery of the
                   conditioned reflex to its initial level after the interval
Animals            10 days      20 days
Crucian No. 6      20           71
Crucian No. 7      35           84
Crucian No. 8      43           74
Pigeon No. 1       5            22
Pigeon No. 3       —            23
Pigeon No. 7       7            23
Rabbit No. 1       0            5
Rabbit No. 4       0            5

[The column "Number of combinations performed before the interval in experimentation" is illegible in the source.]


In the course of analysis of the re-
sults of these investigations we were
primarily impressed with the uni-
versal significance of the food-seeking 
reflex, irrespective of its kind. How- 
ever, separate facts obtained as a re- 
sult of our investigations showed that 
the principal mechanism in higher 
nervous activity which developed 
during the course of phylogenesis was 
the mechanism of analysis and syn- 
thesis of stimuli, as well as that in- 
volved in the activity of the organism 
arising in response to stimuli. 


ANALYSIS AND SYNTHESIS 
OF EXTERNAL STIMULI 


An organism at any level of phylo- 
genesis is subjected to a tremendous 
number of influences exerted by the 
external environment; but not all 
organisms possess equally the faculty 
to isolate and analyze the elements of 
these influences, to synthesize them 
into complexes, and to link them 
with one of their own activities. 

In other words, not all organisms 
are equally capable of learning to 
adapt themselves to the external en- 
vironment in the best possible way on 
the basis of the complex signals that 
come from this environment. 

As far back as the early years of 
research (1903-1910) Pavlov's lab- 
oratories demonstrated that the sali- 
vary reflex is formed to a complex of 
stimuli, that the animal is able to 
distinguish a separate component 
from the whole complex and that 
after long training the complex is per-
ceived as a single whole (Pavlov).
These investigations laid the
foundation of further thorough re-
search into the analytico-synthetical
mechanisms of higher nervous ac-
tivity. From the point of view of the
above we were interested to find out
the degree of development of these
mechanisms in vertebrates at differ-
ent levels of phylogenesis. With this
aim in view, and using the same in-
dex of conditioned reflex activity,
namely, the food-seeking reaction, we
began to apply as conditioned stimuli
various combinations of agents ad- 
dressed to the same or different analy- 
zers, i.e., to the sense organs in the 
broad meaning of the word, including 
their cortical projections. Depending 
on the tasks in our investigation, we 
communicated either an active or an 
inhibitory signaling property to com- 
binations of stimuli consisting of 
simultaneously or consecutively act- 
ing agents. 

Several variants of conditioned re- 
flexes to complex stimuli were in- 
vestigated in our laboratory in the 
course of a number of years. In 
particular, our laboratory investi- 
gated a simple case of a complex in- 
hibitory conditioned stimulus which 
is termed "conditioned inhibitory 
combination of stimuli" (Cherno- 
mordikov, 1953; Firsov, 1953; Mali- 
novsky, 1953a; Prazdnikova, 1953b). 

In some cases the conditioned in- 
hibitory agent was added to the posi- 
tive stimulus and they acted simul- 
taneously (a simultaneous combina- 
tion). In other cases the conditioned 
inhibitory agent acted first; then its 
action was discontinued and a positive 
agent took its place (a consecutive 
combination). Finally, in a third 
series of experiments these two 
stimuli were separated from each 
other by an interval of 5-20 seconds 
(a consecutive-trace combination).
In all these cases the combinations of
stimuli were not accompanied by
food, as is usually done in the case
of formation of a conditioned in-
hibitor.

Table 8 shows that the possibility 
and rate of formation of a conditioned 
inhibitor are not the same in different 
animals and in the presence of differ- 
ent conditioned inhibitory combina- 
tions of stimuli. 

TABLE 8
RATE OF FORMATION OF DIFFERENT VARIANTS OF CONDITIONED INHIBITION

                 Number of applications of the conditioned-inhibitory combination of stimuli
                                                    Consecutive-trace combination
                 Simultaneous     Consecutive       (interval between stimuli)
Animals          combination      combination       5 seconds     10 seconds
Goldfish         15-20            69                —
Tortoise         7-143            —                 —             —
Hen              5-15             10-70             99            68-110
Rabbit           14-26            40                58            69-80
Baboon           8-11             8-120             20-100        35-41
Chimpanzee       3                2                 2

* Conditioned inhibition was not formed.


Whereas under the simultaneous
action of a combination of stimuli
conditioned inhibition is formed in all
animals, in the case of a consecutive
action of stimuli this kind of internal
inhibition develops in lower verte- 
brates (fish and tortoises) with great 
difficulty, or does not emerge at all. 
Only higher apes constitute an excep- 
tion: in this case a negative tem- 
porary connection is invariably 
formed under any of the tested struc- 
tures of complex inhibitory stimuli. 
The above facts show that the 
analysis and synthesis of a complex 
inhibitory stimulus is feebly pro- 
nounced in fish and tortoises. These 
processes are somewhat better ac- 
complished in hens and rabbits, still 
better in baboons, and they are 
greatly developed in higher apes— 
chimpanzees. Then, with the same 
aim in view, we investigated a more 
complicated form of differentiation of 
complex stimuli, in which one and the 
same agent acquired either a positive 
or a negative signaling property, de- 
pending on the stimulus with which it 
acted simultaneously. For example, a 
positive conditioned reflex was estab- 
lished in the experimental animal to 
Agent A and a negative conditioned 
reflex to Agent B. Then an indiffer- 
ent Stimulus C was added in some 
cases to Stimulus A and in others to 
Stimulus B. The combination of 
Stimuli A+C was not reinforced by
food, while the combination B+C
was. After a number of applications 
the combination A+C ceased to 
evoke a conditioned reaction, i.e., 
C began to play the role of a condi- 
tioned inhibitor which inhibited the 
positive signaling property of A. At 
the same time the combination B+C 
began to evoke a conditioned reflex, 
i.e., C acquired the significance of a 
conditional disinhibitor which elimi- 
nated the inhibitory property of B. 
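
The contingencies of this design can be written out explicitly. The following is a minimal sketch of the reinforcement rule described above, with A, B, and C as in the text; the dictionary `REINFORCED` and the function `role_of_added_agent` are hypothetical names introduced only for illustration.

```python
# Reinforcement contingencies from the text: A alone is followed by
# food, B alone is not; adding C reverses both outcomes.
REINFORCED = {
    frozenset("A"): True,
    frozenset("B"): False,
    frozenset("AC"): False,
    frozenset("BC"): True,
}


def role_of_added_agent(base, added="C"):
    """Classify what the added agent does to the signal value of `base`."""
    alone = REINFORCED[frozenset(base)]
    combined = REINFORCED[frozenset(base + added)]
    if alone and not combined:
        return "conditioned inhibitor"
    if not alone and combined:
        return "conditioned disinhibitor"
    return "no change"


print(role_of_added_agent("A"))
print(role_of_added_agent("B"))
```

The point of the design is that C has no fixed signal value of its own: an animal can respond correctly only by treating each combination as a whole.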

Our collaborators, S. M. Makokina 
and Y. A. Kholodov (1959), investi- 
gated this phenomenon with chim- 
panzees, baboons, and dogs; and 
K. I. Iordanis (1959a) investigated 
it with rabbits, pigeons and tortoises. 
It was found that a differentiation of 
the double property of the same 
agent could be established in chim- 
panzees, baboons, and dogs, though 
with great difficulty. Owing to this, 
the indifferent agent which was added 
to the positive conditioned stimulus 
began to inhibit its effect, while the 
indifferent agent added to the nega- 
tive stimulus began to eliminate its 
inhibitory action. 

This differentiation is less pro- 
nounced in rabbits and pigeons; we 
have never seen a single animal of 
this group properly respond to a
conditioned-inhibitory and a condi-
tioned-disinhibitory combination of
stimuli within the same experiment
and in all, or almost all, cases. Both
"proper" and "improper" reactions
were usually observed in the course of
the same experiment.

Experiments with Greek tortoises
did not produce even a hint of any
differentiation by these animals of the
complex stimuli that were applied by
us.

Thus, in this case of differentiation 
of complex conditioned reflexes, just 
as in the previous case, we observed a 
similar picture of marked distinctions 
in the degree of development of the 
analytico-synthetical mechanisms of 
lower and higher vertebrates. This 
conclusion is also corroborated by 
data obtained from investigations on 
conditioned reflexes to simultaneous 
complexes of stimuli and to chains of 
stimuli. 

It was long ago established by the
investigations in Pavlov's labora-
tories, as well as by other investiga-
tions, that if a complex conditioned
reflex is established in a dog to a com-
bination of stimuli, for example, to
three simultaneously acting stimuli,
at first both the whole complex of
stimuli and its separate components
will have signal significance. But
with the training of the conditioned
reflex, separate components grad-
ually lose their signaling properties,
and only the complex of stimuli as a
whole can evoke a conditioned reflex
effect.

Investigations carried out in our 
laboratory (Voronin, 1957) on mon- 
keys (baboons), rabbits, and fish 
showed that this process of syn-
thesizing separate components of a
complex of stimuli into a single, in-
tegral stimulus is inherent in mon-
keys, but is wholly absent in fish.
These investigations disclosed that
a conditioned reflex to the simul-
taneous action of two (sound and
light) stimuli is established in mon-
keys and dogs quite rapidly after
3-20 combinations. In monkeys it is 
also easy to establish a differentia- 
tion between a complex of stimuli and 
its separate components. For this 
purpose, it is sufficient to use seven 
isolated applications of the com- 
ponents without food reinforcement. 

The establishment of such a differ- 
entiation is much more difficult when 
separate components—and not the 
complex as a whole—possess a posi- 
tive signaling property. This requires 
about 100 contrapositions of the com- 
plex of stimuli unreinforced by food 
and of its components which are ac- 
companied by the administration of 
food. 

Whereas in monkeys both variants 
of a differentiation between a com- 
plex of stimuli and its components 
could be established with greater or 
lesser difficulty, in fish they are not 
observed at all. 

Rabbits occupy an intermediate 
place between monkeys and fish; al- 
though it proved possible to estab- 
lish in them with great difficulty a 
differentiation between a complex of 
stimuli applied without food rein- 
forcement and its components ac- 
companied by the presentation of 
food, this differentiation was very
unstable; the animal often reacted to 
the complex of stimuli in a positive 
way. 

Motor conditioned reflexes to 
chains of stimuli were investigated in 
fish (Prazdnikova, 1955), tortoises 
(Voronin, 1957), hens, pigeons, jack- 
daws, and crows (Ovchinnikova, 
1955), dogs (Firsov, 1954), chim- 
panzees, macaques, baboons, and 
capuchins (Firsov, 1955; Prazdni- 
kova & Firsov, 1953; Shirkova, 1949). 

The scheme of a three-component 
chain of stimuli consisted in the fol- 
lowing: the first component acted 
during 5 seconds, then it was ex-
cluded and replaced by the second
component which acted during the
succeeding 5 seconds; after that the 
third component of the chain stim- 
ulus which coincided with the ad- 
ministration of food was put into ac- 
tion for a period of 10-15 seconds. 
This scheme was applied in all ex- 
periments, but the time of action of 
the stimuli was in some cases pro- 
longed, especially when slow animals, 
for example, tortoises, were investi- 
gated. 
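
The timing of one presentation of the chain can be written out explicitly. The Python sketch below is only an illustration of the scheme just described; the function name `chain_schedule` and its parameters are hypothetical, and the default of 10 seconds for the third component is simply the lower end of the 10-15 second range given above.

```python
from typing import List, Tuple

# One entry per component: (name, onset in s, offset in s, food given?)
Segment = Tuple[str, float, float, bool]


def chain_schedule(first: float = 5.0, second: float = 5.0,
                   third: float = 10.0) -> List[Segment]:
    """Build the timeline of a single three-component chain stimulus.

    Each component is switched off when the next one begins; only the
    third component coincides with the administration of food.
    """
    schedule: List[Segment] = []
    onset = 0.0
    for name, duration, with_food in (
        ("first", first, False),
        ("second", second, False),
        ("third", third, True),
    ):
        schedule.append((name, onset, onset + duration, with_food))
        onset += duration
    return schedule


if __name__ == "__main__":
    for name, start, stop, food in chain_schedule():
        print(f"{name:>6}: {start:4.1f}-{stop:4.1f} s, food={food}")
```

Longer durations, such as those used with slow animals like tortoises, would be passed as arguments; the source does not give the exact values used.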

In studying these conditioned re- 
flexes we paid special attention to the 
phenomenon which has been men- 
tioned above in connection with ex- 
periments concerning conditioned 
reflexes to a simultaneous complex of 
stimuli, namely, to the process of syn- 
thesis of separate components into a 
single stimulus. 

In order to reveal this process, we
tested from time to time the
signaling properties of separate links 
of the chain of stimuli in the course 
of protracted training of conditioned 
reflexes in various animals. 

As seen from Table 9, in fish, in 
spite of frequently observed negative 
reactions to isolatedly applied com- 
ponents of chains of stimuli, in most 
cases they do not lose their signaling 
properties, even after a very long
training of a conditioned reflex to a
complex stimulus. 

On the basis of these data we can 
state that crucians manifest a certain 
feebly pronounced tendency to dif- 
ferentiate a chain stimulus from its 
components. In tortoises such a 
tendency is out of the question, since 
only two cases (out of 30 tests per-
formed) produced a negative effect,
which may be accounted for by 
accidental circumstances. 

Mammals (beginning with rabbits) 
differ greatly in this respect from fish 
and tortoises. Table 10 shows that in 
the overwhelming majority of cases 
the first and second links of the chain 
of stimuli lost their signaling property 
for these animals, and only the third 
link retained it longer than the two 
others. Experiments on dogs yielded 
almost similar results.

A comparison of the results ob- 
tained from experiments on rabbits 
and dogs with those obtained from 
investigations of the lower and higher 
monkeys revealed a striking differ- 
ence between these groups of ani- 
mals. 

TABLE 9

TESTS OF THE SIGNALING PROPERTIES OF SEPARATE COMPONENTS
OF A CHAIN STIMULUS IN FISH
(Experiment with two crucians)

Note. The figures designate the number of applications of the chain
stimulus after which the test was performed; the + sign denotes the
presence of a conditioned reaction, the − sign denotes its absence.


TABLE 10

TESTS OF SIGNALING PROPERTIES OF SEPARATE COMPONENTS
OF A CHAIN STIMULUS IN RABBITS

Number of            Components
rabbit         I         II        III
1              66+       70−       74+
               687−      690−      693−
               775−      776+      781−
               1204−     1202+     1209−
               1210+     1212−
3              65−       67−       69+
24             83−       80−       78+
               118+      120−      123+
               173−      174−      175+
25             18+       79+       80+
               109+      112+      118+
               185−      186−      187+
               221−      219+      223+
               336−      338+      340−
26             177−      175−      176+

Note. The plus and minus signs have the same meaning as in Table 9.


Experiments with lapunder ma-
caques showed a very rapid (in the
course of only 3 experiments) devel-
opment of the retardation of a reflex
to the first components of the chain
of stimuli. Tests of separate com-
ponents revealed the absence of a
conditioned reaction after 40-60
applications of the chain stimuli.
The results of these experiments were
also corroborated in experiments with
capuchins, although these monkeys,
which are more excitable than ma-
caques, in most cases exhibited a posi-
tive reaction if the components
acted during the same period as the
chains of stimuli.

Thus, the results of experiments
on fish, tortoises, rabbits, dogs, and
monkeys led us to the conclusion that
the more developed the nervous sys-
tem of the animal, the better is the
synthesis of the components of a
complex stimulus effected. This con-
clusion is based on a comparison of
data obtained from experiments with
conditioned reflexes to different kinds
of complex stimuli, including chains
of stimuli. It should be borne in 
mind, however, that the process of 
synthesizing separate links of a chain 
of stimuli into a single stimulus does 
not always manifest itself quite 
clearly judging by the fact that sepa- 
rate components lose their signaling 
properties. This is apparently due 
to the specific nature of the synthesis 
of a chain stimulus: in this case each 
of the components first of all directly 
signals the appearance of the next 
component and indirectly signals the 
appearance of food. Owing to this, 
only higher animals can perceive an 
isolated application of a component 
of a chain of stimuli as a new stim- 
ulus. 

This process of differentiation is ac-
complished through the mechanism
of the orienting reflex which is evoked 
by the difference between the action 
of the component and the usual ac- 
tion of the complex of stimuli. 

Comparative-physiological investi- 
gation of conditioned reflexes to dif- 
ferent kinds of complex stimuli is of 
double significance. First, it shows 
that the fusion of separate influences 
falling upon the organism into a 
single complex stimulus is effected 
gradually, according to the same 
principle of analysis and synthesis 
which underlies the formation of an 
elementary conditioned connection. 
Second, it allows us to make the as- 
sumption that the faculty of analyz- 
ing and synthesizing stimulations has 
developed in the course of evolution. 

It is clear that the material ac- 
cumulated by us during the study of 
conditioned reflexes to complex stim- 
uli does not give any ground for con- 
cluding that lower vertebrates do 
not possess the faculty of analyzing 
and synthesizing complex stimuli 
altogether. The fact that condi- 
tioned reflexes to complex stimuli 
were established in such animals
testifies to the analytico-synthetical
activity of the higher division of their 
nervous system. But this activity 
has not yet assumed the higher form 
due to which numerous and complex 
interacting "functional combination 
centers" (Pavlov, 1938, p. 365) are 
formed in the brain structures where 
the function of coupling is effected. 

Owing to this faculty of higher 
adaptation, the organism is able, as 
it were, to “group” or “reduce” the 
infinite number of stimulations that 
fall upon it and to associate them, 
according to the principle of tem- 
porary connection, with its own vari- 
ous activities. 


ANALYSIS AND SYNTHESIS OF 
PROPRIOCEPTIVE (KINESTHETIC) 
STIMULATIONS 


When we deal with such a complex 
and highly universal external mani- 
festation of higher nervous activity 
as motor activity acquired by the 
organism in the course of its individ- 
ual life, we must possess definite 
knowledge of the analysis and 
synthesis not only of exteroceptive, 
but also of proprioceptive stimulations.

On the basis of a series of investiga- 
tions conducted in Pavlov's labora- 
tory (1938) it has been assumed (in 
general form this was expressed by 
I. M. Sechenov as far back as
1861-66) that any complex movement
"learnt" by a human being or animal
is none other than a chain of motor 
reflexes, where the end of one reflex 
serves as a stimulus of another reflex. 
In other words, the formation of a 
chain action in the shape of a complex 
motor act is based on the analysis and 
synthesis of proprioceptive (kinesthe- 
tic) stimulations. 

Kinesthetic stimulations interact 
with the exteroceptive ones (audi-
tory, photic, and others) and can, of
course, associate with interoceptive
signals. Apparently, this explains
the fact that the formation of two
homogeneous* food-seeking reflexes
to the light of electric bulbs of differ- 
ent colors applied in one and the 
same spot of the experimental cham- 
ber is equally greatly impeded in 
carp, pigeons, rabbits, and dogs
(Krushinskaya, Kholodov, Shura- 
nova, & Shcherbina, 1960). Even 
after 300-700 trainings of two motor 
reflexes, the conditioned reflexes are 
far from being appropriate in all 
cases, i.e., they do not always cor-
respond to the stimuli. For example,
in dogs and rabbits the number of 
reactions to both manipulators in- 
creased due to the difficulty of differ- 
entiation; in fish there was an in- 
crease in the number of reactions to 
one of the two manipulators, i.e., 
there was observed a predominance of 
reactions either to the manipulator 
situated in the left corner of the 
chamber, or to that situated in the 
right corner. 

In pigeons, as well as in fish, this
phenomenon of predominance was
likewise observed, but it was of a less
pronounced character.

As shown by the experiments
of V. I. Ivanova (1960), a differentia- 
tion of two homogeneous motor re- 
flexes is established with compara- 
tive ease in fish, pigeons, and rabbits 
if the conditioned stimulus is applied 
in different spots of the chamber; 
after 50-200 applications of both 
stimuli all these animals exhibited 
proper reactions. It was found that
the individual specific features of the
animals manifested themselves with
greater force than did their phylo-
genetic distinctions.

In the course of formation of three
motor reflexes essential differences
were observed between fish, on the
one hand, and pigeons and rabbits,
on the other.

* We term homogeneous conditioned reflex
movements those which are effected by the
same groups of muscles and heterogeneous
those which are effected by different groups of
muscles.

In fish no complete differentiation 
of three reflexes was noted, in spite of 
500-800 applications of each of the 
stimuli; erroneous reactions took 
place in 30-40% of all cases, and 
only in separate experiments did the 
animal properly react to all three 
stimuli. 

In pigeons and rabbits proper reac-
tions took place after 100-300 ap-
plications of the stimuli and in only
4-10% of all cases were erroneous
movements observed. Absolute dif-
ferentiations of four reflexes were
established in pigeons and rabbits
with great difficulty only after
400-600 applications of the stimuli. The
dynamics of formation of several
homogeneous motor reflexes in fish,
pigeons, and tortoises showed: (a)
that the establishment of three re-
flexes⁵ constitutes the limit of the
analytico-synthetic abilities of fish,
while in pigeons and rabbits four
homogeneous motor reflexes are
easily established; (b) that the forma-
tion of two and more reflexes passes
through two stages: the stage of
generalization, when one stimulus
can evoke any of the established re-
flexes; and the stage of specialization,
when the movement is adjusted to a
strictly definite stimulus.

We also investigated the dynamics
of formation of several heterogeneous
motor reflexes, i.e., movements which
differ from one another by the fact
that they are accomplished either by
different effectors (oral cavity or
extremities), or by different muscles of
the same effector, for example, when
the subject presses a lever towards
himself or in the opposite direction, or
by a different number of muscles,
for example, a local grasping move-
ment by the jaws and the movement
of the animal to a definite place in
the chamber (Iordanis, 1958, 1959b;
Voronin & Iordanis, 1960, 1961; &
others).

5 Experiments performed by K. A. Iordanis
showed that the possibilities of tortoises are
more limited: only two homogeneous
motor reflexes could be established in them.

As may be seen from Table 11, the
principal difference between tortoises
and other animals consists in the fact
that in tortoises, just as in fish, a
strongly pronounced differentiation
of motor reflexes is observed only in
the case of two such reflexes. In the
case of three reflexes there is evidence
of a disturbance of the conditioned
reflex activity which is expressed in
the disappearance of all previously
established reflexes.


TABLE 11

COMPARATIVE DATA CONCERNING DIFFERENTIATION OF MOTOR REFLEXES

                      Number of experiments required
                      for differentiating motor
                      conditioned reflexes*
Animals               Two         Three       Four
                      reflexes    reflexes    reflexes
Pigeon                3-16        6-9         -
Rabbit                10-17       12-57       41-44
Dog                   18-46       8-24        -
Macaque (rhesus)      25-36       20-114      90-144
Chimpanzee            28-41       57-92       8-14

* In the course of experimentation 12-18 stimulations
were applied to tortoises, 15-20 to pigeons and rabbits, and
30-35 to dogs and monkeys. The number of applications of
each stimulus was approximately equal.

On the basis of these experiments, 
we can draw the paradoxical conclu- 
sion that two or three reflexes are 
better differentiated in pigeons than 
in rabbits, while in monkeys they are 
differentiated worse than in rabbits. 
A differentiation of four motor re- 
flexes proceeds with greater difficulty 
in monkeys. Chimpanzees constitute
a certain exception in this respect: a
differentiation of four reflexes is es-
tablished in them more easily than a dif-
ferentiation of two or three reflexes.
We find it difficult to explain this
difference between pigeons and rab-
bits. As to the difference between
monkeys and other animals, it can
be explained, apparently, by the fact
that these animals are unequilib- 
rated. The process of excitation con- 
siderably predominates in them over 
the process of inhibition (Voronin, 
1952); as is known, this circumstance 
greatly impedes the process of differ- 
entiation of any stimuli, including 
kinesthetic ones. 
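
The ordering behind this "paradoxical conclusion" can be checked directly against the ranges reported in Table 11. The Python sketch below is only an illustration: the dictionary `TABLE_11` transcribes the ranges from the table, while the use of the range midpoint as a summary figure, and the names `midpoint` and `ranking`, are assumptions made here and are not part of the original analysis.

```python
# Ranges of experiments required, transcribed from Table 11.
# Keys of the inner dictionaries are the number of reflexes differentiated;
# cells for which the table reports no figure are simply omitted.
TABLE_11 = {
    "pigeon": {2: (3, 16), 3: (6, 9)},
    "rabbit": {2: (10, 17), 3: (12, 57), 4: (41, 44)},
    "dog": {2: (18, 46), 3: (8, 24)},
    "macaque": {2: (25, 36), 3: (20, 114), 4: (90, 144)},
    "chimpanzee": {2: (28, 41), 3: (57, 92), 4: (8, 14)},
}


def midpoint(animal: str, n_reflexes: int) -> float:
    """Midpoint of the reported range (an assumed summary statistic)."""
    low, high = TABLE_11[animal][n_reflexes]
    return (low + high) / 2


def ranking(n_reflexes: int) -> list:
    """Animals ordered from fewest to most experiments required."""
    animals = [a for a in TABLE_11 if n_reflexes in TABLE_11[a]]
    return sorted(animals, key=lambda a: midpoint(a, n_reflexes))


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, "reflexes:", ranking(n))
```

On this crude summary pigeons precede rabbits and rabbits precede macaques for two and three reflexes, and the chimpanzee requires fewer experiments for four reflexes than for two or three, which agrees with the description in the text.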

A differentiation of several motor
conditioned reflexes is established
much more easily in the course of one ex-
periment if each of these reflexes, in-
dependently of all others, was
previously established and stabilized
in various experiments.

The investigations of Tagiev 
(1958) showed that if two condi- 
tioned reflexes, previously estab- 
lished in separate experiments, are 
“brought together,” the conditioned- 
reflex activity of the animals proves 
to be deranged only for a short time 
as manifested in a larger number of 
erroneous reactions. But even here a 
certain difference was observed be- 
tween pigeons and rabbits, on the one 
hand, and carp, on the other. 

Rabbits and pigeons exhibited 
proper reactions after only two or 
three applications of the stimuli, 
while carp showed such reactions 
only after 5-60 applications. 

A similar difference between these 
animals was clearly noted in the ex- 
periments of Chumak (1957b) who 
studied conditioned reflexes in rela- 
tion to the magnitude of stimulation. 
In these experiments, the animal had 
to perform a certain action with two 
manipulators near the larger positive 
conditioned stimulus (reinforced by 
food).

In fish a differentiation between
the signals and the direction of move- 
ment was first observed after 110-118 
applications of the stimuli; it became 
stabilized after 200-210 such applica- 
tions. In pigeons the differentiation 
was first observed after 40-80 ap- 
plications, in rabbits after 60-80 
applications; the differentiation was 
stabilized in pigeons after 6-90 and in 
rabbits after 90-110 applications of 
the stimuli. 

Thus, facts obtained from the 
study of the dynamics of establish- 
ment and stabilization of several 
food-seeking heterogeneous and 
homogeneous conditioned reflexes 
show that this is a more complex 
process than the establishment of a 
single food-seeking reflex. 

A comparison of these facts with 
the previously described results of 
investigations on the properties of 
nervous processes allows us to assume 
that the phylogenetic difference in 
the analysis and synthesis of stim- 
ulations is, apparently, due to the 
different strength of the nervous 
processes of excitation and inhibi- 
tion, as well as to their lability and 
equilibrium. 

We believe that the interaction of 
excitation and inhibition lies at the 
base of any differentiation. For ex- 
ample, two or more manipulanda are 
placed in front of the animal, but 
only one signal emerges; in this case 
the animal must choose one of the 
manipulanda and perform a definite 
movement; in other words it must 
differentiate these manipulanda, and 
this, in the final analysis, means 
differentiation of movements on the 
basis of kinesthetic signals.* It is
quite obvious that when the animal
effects one movement or one motor
conditioned reflex, it inhibits an-
other movement, another motor con-
ditioned reflex.

* It goes without saying that a definite role
is played here also by stimulation of the
receptors of other sense organs, but, as we
shall see later, in the case of formation of
motor conditioned reflexes kinesthetic signals
predominate.

The stronger the nervous processes 
of excitation and inhibition, the more 
labile and more equilibrated they 
are, the easier is effected this co- 
ordination of movements. From this 
point of view, it becomes clear why 
fish, which possess weak nervous 
processes, and lower monkeys, in 
which the nervous processes are not 
sufficiently equilibrated, coped with 
the experimental task with greater 
difficulty than rabbits and pigeons. 
It should be pointed out that when 
we speak of weakness, insufficient 
equilibrium and poor lability of 
nervous processes, we apply these 
notions relatively to the conditions of 
the given experiments and in com-
parison with a higher level of develop-
ment of the nervous system.

We can rightly say that in any
species of animals, which live in ade-
quate conditions, the properties of
the nervous processes are sufficiently
developed to insure their adaptation
to the conditions of existence through
individual training, i.e., through tem-
porary connections or conditioned
reflexes.

It must be also borne in mind that
although the differentiation of stimu-
lation is feasible due to the properties
of both excitatory and inhibitory
nervous processes, it is the absolute
and relative strength of inhibition
that plays the decisive role. Owing
to this factor, fish, which possess weak
internal inhibition, and lower mon-
keys, in which the inhibitory process is
comparatively weak, i.e., animals
with unequilibrated nervous proc-
esses where excitation predominates
over inhibition, solved the experi-
mental task with greater difficulty
than pigeons and rabbits.

At the same time monkeys had an
obvious advantage over lower verte-
brates in another respect: while in
fish and tortoises it proved possible to
establish only two separate motor re-
flexes, in monkeys the number of such
reflexes reached four, and even this
was not the limit of their capacity to
differentiate proprioceptive stimu-
lations.

Up to now we have dealt with the 
predominant manifestation of the 
process of analysis, with the separa- 
tion of one stimulus from two or 
several stimuli. This process, how- 
ever, is indissolubly bound up with 
the process of synthesis or the process 
of combining stimuli. 

We judged the process of synthesis
of kinesthetic stimulations by the dy-
namics of the fusion of separate
motor acts into a single chain of 
movements. After the establishment 
of two or several stable conditioned 
reflexes we formed a chain of their 
corresponding stimuli in which the 
agents succeeded one another, each 
acting during 2 or several seconds. 

First of all, it was found that in re-
sponse to a chain of stimuli the experi-
mental dog performed a chain of cor-
responding movements without the
reinforcement of each of them by food
(Rokotova, 1955; Voronin, 1947a).

Similar phenomena were observed 
in experiments with pigeons, rabbits, 
and monkeys in which four condi- 
tioned stimuli were combined in a 
chain (Iordanis, 1958; Ivanova, 
1960; & others).

It was also found that in fish this
process of "mechanical" synthesis⁷
of stimuli and movements is quite dif-
ferent. When the conditioned stimu-
lus of a separate movement was not
single, but bore a chain character, the
consecutive combination of such
"simple" chains of stimuli into a com-
plex chain did not evoke any chain of
corresponding movements; the fish
remained beside the bead which was
linked with the first chain stimulus;
it performed several movements in
spite of the fact that the second link
of the complex chain of stimuli was
already in action; then the fish swam
away (Tagiev, 1958).

7 By this term we designate the phenom-
enon which arises without any special elabora-
tion of the process of combining separate
movements into a chain; although the animal
did not undergo any preliminary training, it
reacts to the conditioned stimuli in the same
sequence in which they are applied.

When a single agent was used as a 
conditioned stimulus of a separate 
reflex, the process of mechanical syn- 
thesis of movements proved feasible; 
however, the number of the reflexes 
combined was limited. 

Thus, experiments performed by 
Ivanova (1960) on carp showed that 
the combination of three conditioned 
stimuli into a chain evoked a chain of 
movements of the same sequence.

This phenomenon was not noted in
tortoises. To a chain consisting only 
of two stimuli they reacted with re- 
peated movements corresponding to 
the first component of the chain; then 
this reaction was fully discontinued. 

In experiments with rabbits and 
pigeons, Tagiev (1958) observed a 
proper reaction to a complex chain of 
stimuli which consisted of seven ap- 
plications of two alternating simple 
chains. To each application of a 
simple chain the animal reacted with 
a corresponding movement. 

The difference between the analyt- 
ico-synthetical abilities of fish and 
tortoises, on the one hand, and pi- 
geons, rabbits, dogs, and monkeys, 
on the other, was still more obvious 
when we began to apply regularly a 
chain of stimuli consisting of three 
and more sound and light agents. 

In a series of experiments per- 
formed by Ivanova (1960) the train- 
ing of a three-component chain of 
movements led to a marked disturb-
ance of conditioned reflexes in fish; in
response to a chain stimulus the
animal in most cases reacted with
disorderly movements, and then
dropped to the bottom of the aquar- 
ium. If the fish did manifest a posi- 
tive reaction to the stimuli, this re- 
action was of an unusual character: 
the fish rapidly approached the bead, 
snatched it violently and impetu- 
ously, and then as rapidly swam 
away, whereas usually the fish 
snatches the bead and turns to the 
place where the food appears (if for 
some reason the food does not appear, 
it snatches the bead again). The re- 
placement of a three-component 
chain of stimuli by a two-component 
chain led to the normalization of the 
conditioned reflex activity: the fish 
began to manifest proper reactions to 
the stimuli. 

After 50-70 applications of a chain 
stimulus there were observed some 
cases when the fish reacted with a 
two-component chain of movements 
just to the first component of the 
stimulus alone. Some of the inter- 
signal reactions* also bear the char- 
acter of chain reflexes. Thus, it is 
possible to establish in fish a system 
or stereotype consisting of two move- 
ments arising in a definite sequence to 
the first link of a two-component 
chain of stimuli. This proved to be 
the limit of the analytico-synthetical 
abilities of carp. 

* Reactions "spontaneously" arising in the
intervals between conditioned stimulations.
Apparently, certain elements of the surround-
ings in which the experiment takes place, as
well as some internal stimulation (for ex-
ample, "hungry" blood) are the stimuli of
such reactions.

It must be pointed out that the
above-mentioned experiments of
Tagiev on fish demonstrated not only
the absence of any so-called mechani-
cal synthesis of movements after a
consecutive application of two chain
stimuli, but also the absence of real
synthesis* after a chronic application
of these stimuli in a chain order. An
attempt to establish a set of two
movements evoked only by the first
component of a chain stimulus re-
sulted in a disturbance of the condi-
tioned reflex activity; the fish swam
away from the stimuli.

At the time of such negative re- 
actions to a complex chain of stimuli 
there was also observed a disturbance 
of reflexes to its components—to the 
simple chain stimuli: the fish did not 
always react to these stimuli or "con-
fused" them, producing inadequate
reactions. 

These distinctions, as well as those 
which were revealed by the experi- 
ments of Ivanova and Tagiev in the 
mechanical synthesis of movements, 
are, apparently, accounted for by the 
different complexity of the condi- 
tioned stimuli. 


* By this term we designate the phenom- 
enon of formation of a system or stereotype of 
movements which arises only to the first or
"starting" component of a chain stimulus.


In pigeons, rabbits, dogs, and mon- 
keys the dynamics of formation and 
the length of the system of reflexes 
consisting of several heterogeneous 
motor reactions depends both on the 
individual properties of the nervous 
system and, especially, on its phylo- 
genetic peculiarities. 

Our data (Voronin & Iordanis,
1960) which were verified and supple-
mented in subsequent experiments
(they are presented in Table 12)
show that, in spite of considerable
individual distinctions, the formation 
of a motor set of three or four move- 
ments is the limit of the analytico- 
synthetical abilities of pigeons and 
rabbits. In monkeys and apes 
(higher and lower), judging by the 
dynamics of the process of formation 
of an automatized chain motor reflex 
(a motor stereotype), the ability to 
synthesize four movements is not the 
limit. 

TABLE 12

RATE OF SYNTHESIS OF TWO, THREE, AND FOUR MOVEMENTS

Number of stimulations required for:

                          Synthesis of two movements
Animals                   First             Stabili-
                          manifestation     zation
Pigeon No. 11             15                18
Pigeon No. 12
Rabbit No. 21             2                 E
Rabbit No. 22             84                96
Rabbit No. 23             26                58
Rabbit No. 25             69                unstable
Rabbit No. 29             53                73
Dog "Marsik"                                108
Macaque "Pashka"          67                89
Capuchin "Shalun"         25                90
Chimpanzee "Lada"         42                42
Chimpanzee "Sultan"       27                59

Synthesis of three movements; synthesis of four movements

First             Stabili-
manifestation     zation
52*               -
94                128
304*              x
85                225
898               a
169*              A
280               283
124               125
88                114
91                91
49                80
62                66

* Failure, experiments discontinued.


A similar picture of synthesis of
four homogeneous motor reactions
into a chain of movements was ob-
served by Ivanova (1960) in experi-
ments on pigeons and rabbits. This
investigation showed that a firm con-
nection was established between the
first and second, as well as between
the third and fourth components of
the movements. A less stable and
durable connection, which is, there-
fore, not always manifest, is estab-
lished between the second and third
links of the chain of movements.

Owing to this distribution of con- 
nections within the structure of a 
chain motor reflex, a chain of move- 
ments is formed best to a consecutive 
application of the first and second 
components of the chain stimulus, 
without the application of the third 
and fourth. The application of the 
first component alone seldom evoked 
all four movements, as was noted in 
the case of synthesis of heterogeneous 
movements, but the second move- 
ment always emerged without the ap- 
plication of a corresponding stimulus. 

The number of stimulations used in 
formation of an automatized chain of 
homogeneous reflexes was approxi- 
mately the same as in the case of het- 
erogeneous movements. 

Synthesis of four movements 
proved to be of less pronounced char- 
acter; this is, apparently, explained 
by monotonous kinesthetic stimula-
tions which vary less than in the case
of heterogeneous movements. Owing 
to this, the exteroceptive conditioned 
stimulations assumed a greater signal 
significance; therefore, when the 
third and fourth components of the 
chain stimulus were excluded, the 
chain of movements could still be 
effected, but when the second com- 
ponent was also excluded, the animal 
experienced certain difficulties. In 
the case of heterogeneous move- 
ments, i.e., when the kinesthetic 
stimulations differ greatly, the ex- 
teroceptive stimuli are of lesser sig- 
nificance; therefore after long train- 
ing of the chain of reflexes, the latter 
is effected without any difficulty in 
response only to the first component 
of the chain stimulus. 

The predominant significance (un- 
der the conditions of our experiment) 
of kinesthetic stimulations, as com- 
pared to exteroceptive stimulation, 
clearly manifested itself in the effect 
of an inhibitory (differential)
stimulus which was introduced at different
points of the chain of stimuli before
and after the chain of movements 
turned into a system, or became 
automatized. 

As shown by Table 13, the differential
stimulus (a red light) included
in the chain of stimuli prior to the
synthesis of the motor chain reaction
retained its significance in 64-84% of
all the tests performed. After the
synthesis of four movements into a single
chain reaction which is effected
"automatically" only in response to the
first component of the chain stimulus,
the differential stimulus evoked a
positive effect in all or almost all
instances of its action (Voronin, 1960).


TABLE 13

DISINHIBITION OF A DIFFERENTIAL STIMULUS DEPENDING ON THE
SYNTHESIS OF FOUR REFLEXES

                                     Cases of manifestation of the inhibitory
                                     effect (in percent)

Sequence of application of           Prior to the    Two       Three     Four
conditioned stimuli                  synthesis of    reflexes  reflexes  reflexes
                                     the chain of
                                     reflexes

Bell, red light (differential),      64              58        28        12
white light, blue light, gurgle

Bell, white light, red light                                             0
(differential), blue light, gurgle

The phenomenon is the more pro- 
nounced, the closer the differential 
stimulus in the chain to the moment 
of food reinforcement. 

From all that has been said above 
concerning the synthesis of motor 
acts into a chain reaction it follows 
that the longer the training of the 
chain of movements and the closer 
the component of a chain stimulus to 
the moment of food reinforcement, 
the smaller is the role of exteroceptive
stimulation and the greater is the
role of kinesthetic stimulation in the
formation of a chain motor reaction.

This conclusion is of great impor- 
tance for the comprehension of the 
mechanism which controls the auto- 
matization of man's motor habits— 
their accomplishment, as it were— 
without the participation of any ex- 
ternal signals or consciousness. 

The question will be considered in 
greater detail in a special section of
this report following the discussion of 
facts dealing with the synthesis of 
more complex systems of motor con- 
ditioned reflexes established by the 
method of gradual complication. 
First of all we formed chain motor 
reflexes by the above described 
method of gradually adding new 
stimuli and movements to an already 
established reflex. 

It was found that the length of the 
chain of movements and the rate at 
which a new link is added to it de- 
pend on the phylogenetic level of the 
animal's development. For example, 
an unstable three-component chain of 
movements could be established in 
tortoises with great difficulty, while 
in pigeons, rabbits, and cats it proved 
possible to establish chain reactions 
consisting of 7-9 components (Napal- 
kov, 1958a, 1958b). 

Table 14, based on the results of 
experiments conducted by a number 
of our collaborators (Napalkov, 
1958a; Napalkov & Verevkina, 1959; 
Shirkova & Verevkina, 1960; & 
others), shows that in tortoises the 
formation of a new link in a chain of 
movements requires a considerably 
larger number of combinations of the 
new signal and of the new movement 


TABLE 14
RATE OF FORMATION OF COMPONENTS IN A CHAIN MOTOR REFLEX


Number of combinations required for the formation of
conditioned reflexes (average figures)


eee : Two- Three- Four- Five- 
Sous component component component component 
rer reflex reflex reflex reflex 
25577. FRE | 
Pees B 8 58 116 ni. 30 
Rabbi 10 16 15 2 2 
Rage. 15 18 14 18 21 
Cat 9 12 13 " 18 
Bak iu D 15 12 14 
a H H 
inermes 3. (00d Manes Uu te 


impanzee 
an 2 2 


1 1 
with the previously established
reflexes than the establishment of any
preceding link. In pigeons this phe- 
nomenon is less pronounced than in 
tortoises; rats and rabbits manifest 
only a certain tendency towards it, 
while in cats and dogs it is fully 
absent. 

In monkeys the formation of a new 
link in a chain of movements proceeds 
very rapidly—after only two combi- 
nations. It should be noted, however, 
that the stabilization of the whole 
chain takes place after 2-59 combina- 
tions, i.e., in a number of cases it lasts 
as long as in other mammals (Shir- 
kova & Verevkina, 1960). In man, 
chains of motor reflexes are formed 
with striking speed. After the experi- 
ments the subjects said that they 
at once grasped the method of solving 
the task as stipulated in the pre- 
liminary instructions. 

A conditioned inhibitor can be
established to a chain of motor
reflexes in man and animals. This
means that if the conditioned stimuli
of a basic chain of movements¹⁰ are
applied during the action of an
indifferent agent and the chain of
reflexes arising in response to them is
not reinforced, then after a series of
such negative combinations the
animal or man ceases to react to the
conditioned stimulation. In other
words, a definite agent turns into a
signal for the absence of
reinforcement, i.e., becomes inhibitory.

Table 15 shows that phylogenetic
distinctions cannot be shown in
rabbits, cats, and dogs on the basis of
the rate of formation of a conditioned
inhibitor. Such an inhibitor is
established much more rapidly in lower
monkeys (macaques, baboons) and
very rapidly in man.

It was further found that the
exclusion of this inhibitory signal may
serve as a basis for the formation of a
new chain of motor reflexes. For
example, a human subject or an animal
performs an accidental or induced
movement; at the same time the
conditioned signals which must evoke
definite movements leading to the
accomplishment of the given aim (food
in the case of the animal, the
emergence of a signal confirming the
correctness of the reaction in the case
of man) begin to act; under
these conditions the exclusion of the
inhibitory agent turns into a factor
reinforcing certain new movements.
Thus it is possible to establish a single
or chain motor reflex which
eliminates the conditioned inhibitor or
disinhibits the basic chain of alimentary
conditioned reactions.


TABLE 15

RATE OF FORMATION OF A CONDITIONED
INHIBITOR TO A CHAIN OF
MOTOR REFLEXES

Animals                                 Minimum and maximum
                                        number of combinations

Pigeon                                  21-48
Rabbit                                  18-68
Cat                                     36-60
Dog                                     15-84
Monkey (macaque, baboon, chimpanzee)    2-12
Man                                     2-6


10 By this term we designate a chain of
movements which in animals is reinforced
directly by food and in human subjects by
signals stipulated beforehand.

Such a disinhibitory chain of move- 
ments, consisting of two or three 
components, could be established by 
us in all animals except tortoises. As 
a result of our persistent attempts to 
establish similar reactions in tor- 
toises, the latter ceased to react to 
the conditioned stimuli altogether. 

Our investigations disclosed a
definite process of formation of
disinhibitory chains of movements in
man and animals based on the
exclusion of the conditioned inhibitor;
this process is in principle similar to
the process of formation of chains of
movements which are directly
reinforced by food.


TABLE 16
RATE OF FORMATION OF LINKS OF A
DISINHIBITORY CHAIN OF MOVEMENTS
(Average figures)

           Number of combinations required for the formation of:

Animals    The first link    The second link    The third link
           of a chain of     of a chain of      of a chain of
           movements         movements          movements

Pigeon     75                84                 120
Rabbit     28                31                 50
Dog        18                28                 —
Monkey     14                19                 26
Man        6                 2                  1

Table 16 shows that the rate of for- 
mation of a disinhibitory chain of 
movements steadily increases from 
pigeons to man. 

As to phylogenetic distinctions in
the rate of formation of separate links
in a disinhibitory chain of movements
(similar to those revealed
during the formation of a chain of
movements directly reinforced by
food), they were not so strongly
pronounced.

It may be said that in pigeons a
three-component chain of movements
was formed with great difficulty;
with the "addition" of each new link,
the stabilization of the chain became
more and more impeded.
This lawful phenomenon was also
noted in mammals. In monkeys,
in which disinhibitory chains of
movements are usually rapidly
formed, considerable difficulties were
sometimes observed. Only in man,
who "guesses" or "grasps" the
principle underlying a given task, the
formation of each subsequent link of the
chain of disinhibitory movements did
not require any effort. A complex
system of motor reflexes established
in a like manner has a common
reinforcement (food in the case of
animals, and a stipulated signal in the
case of man); besides, separate links
and parts of this system have their
own reinforcement. Thus, the
disinhibitory chain of movements is
reinforced by the exclusion of the
conditioned inhibitory agent, while each
link of the basic and disinhibitory
chain is reinforced by putting into
action the signal of the subsequent
movement.

It was found that in experiments 
with human subjects this complex 
system could easily be extended ow- 
ing to the formation of conditioned 
inhibitors of second and third orders 
and of disinhibitory chains of move- 
ments consecutively eliminating 
these inhibitors (Voronin & Napal- 
kov, 1960). 

Figure 1 presents a schematic
diagram of the structure of such a
system of motor conditioned reflexes in
man consisting of a basic
(A1-B2-C3-D4) chain of movements,
conditioned inhibitors to it (J1-J2-J3),
and disinhibitory chains of movements
(M5-N6-O7, L8-R9-S10, and V11-W12),
as a result of which the conditioned
inhibitors are eliminated. The
experiment proceeds as follows: the subject
is given the task of switching on an
electric bulb (P); for this purpose, the
subject, acting on his previous
experience, performs a chain of movements
(A1-B2-C3-D4); but if the conditioned
inhibitors are in action (bulbs of other
colors situated at definite points of
the signal board), these movements
do not lead to the solution of the task.

The subject switches on the
conditioned stimuli in the same sequence
in which they were established. He
performs first the third chain of
disinhibitory movements (V11-W12),
then the second (L8-R9-S10) and, finally,
the first (M5-N6-O7). After that he
performs the basic chain of
movements (A1-B2-C3-D4), and thus finally
solves the task, i.e., switches on the
bulb (P).
Fig. 1. Scheme illustrating the system of motor reflexes. (Explanations to be found in the text.)
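
The order of operations described for Figure 1 can be illustrated with a minimal sketch in Python. It is not part of the original experiments: the class `SignalBoard`, the functions `perform` and `solve`, and the assignment of each disinhibitory chain to a particular inhibitor are hypothetical, introduced here only for illustration. The sketch assumes that each higher-order inhibitor blocks the disinhibitory chain of the order below it, so that the inhibitors can be removed only in the reverse order of their establishment.

```python
from dataclasses import dataclass, field

# Labels follow the text; pairing a chain with a given inhibitor is an assumption.
BASIC_CHAIN = ("A1", "B2", "C3", "D4")
DISINHIBITORY_CHAINS = {
    "J1": ("M5", "N6", "O7"),    # first disinhibitory chain
    "J2": ("L8", "R9", "S10"),   # second disinhibitory chain
    "J3": ("V11", "W12"),        # third disinhibitory chain
}


@dataclass
class SignalBoard:
    """Hypothetical model of the signal board with bulb P."""
    # Inhibitors listed in the order of their establishment (J1 first, J3 last).
    active_inhibitors: list = field(default_factory=lambda: ["J1", "J2", "J3"])
    bulb_on: bool = False


def perform(board: SignalBoard, chain: tuple) -> bool:
    """Perform one chain of movements; return True if it was reinforced."""
    if chain == BASIC_CHAIN:
        # The basic chain switches on bulb P only when no inhibitor acts.
        if not board.active_inhibitors:
            board.bulb_on = True
        return board.bulb_on
    # A disinhibitory chain is reinforced (its inhibitor is excluded) only
    # if it belongs to the most recently established inhibitor still acting.
    if board.active_inhibitors:
        last = board.active_inhibitors[-1]
        if DISINHIBITORY_CHAINS[last] == chain:
            board.active_inhibitors.pop()
            return True
    return False


def solve(board: SignalBoard) -> list:
    """Remove the inhibitors in reverse order, then perform the basic chain."""
    performed = []
    while board.active_inhibitors:
        chain = DISINHIBITORY_CHAINS[board.active_inhibitors[-1]]
        perform(board, chain)
        performed.append(chain)
    perform(board, BASIC_CHAIN)
    performed.append(BASIC_CHAIN)
    return performed
```

Under these assumptions, calling `solve(SignalBoard())` yields the third, second, and first disinhibitory chains followed by the basic chain, which is the sequence reported for the human subjects; performing the basic chain while any inhibitor is still active leaves the bulb off.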


We tested the possibility of
establishing similar conditioned inhibitors
and disinhibitory chains in
representatives of our series of
vertebrates. We found that a disinhibitory
chain of movements is not formed in
tortoises at all. In pigeons, rats, and
dogs such a chain can be formed, but
it is only of a primary character. In
these animals it proved impossible to
establish a conditioned inhibitor to a
disinhibitory chain of movements
and to form, on the basis of its
exclusion, a second disinhibitory chain,
i.e., a phenomenon of a secondary
character.

The experiments of Shirkova and 
Verevkina (1960) showed that these 
temporary connections can be formed 
in monkeys (macaques, baboons, 
chimpanzees). This task, however, 
proved to be extremely difficult for 
monkeys, owing to which some
exhibited neurotic states in the form of
a disturbance of the entire
conditioned-reflex activity. As stated
above, conditioned inhibitors of third 
order are established in man; judging 
by their rapid formation, they do not 
constitute the limit of the subject's 
possibilities. 

After the experiments we used to
ask the subjects how they had
managed to solve the given task; they
answered that they had guessed that
it was necessary consecutively to
switch off those signals during the
action of which the task could not be
solved, i.e., the red light could not be
switched on. It is quite clear that the
subject was guided by the principle
which underlies the structure of the
system of signals and which was
grasped by him in the course of
experimentation; owing to the subject's
capacity to generalize facts, he can
solve even more complex tasks on the
basis of this principle. In animals,
owing to the absence of speech or of
the "second signaling system"
(according to the terminology of
Pavlov), such generalization is
impossible. The animal cannot therefore
"guess" the principle on which the
structure of the system of signals, as
well as of the reaction, is based;
consequently, it cannot make use of this
principle when solving the same but
somewhat more complicated task.

It is noteworthy that monkeys
differ greatly in this respect from other
animals. They were able to solve
more complex tasks, apparently not
because of their capacity to guess in
the way man does, but by virtue of
their more developed "first signaling
system," i.e., their capacity to
impress more complex systems of
temporary connections called forth by
concrete (and not abstract) stimuli.

Owing to the highly developed ca- 
pacity of monkeys to synthesize com- 
plex systems of external and internal 
stimuli, their concrete or object 
thinking differs greatly from that of 
other animals. At the same time this 
difference does not have a qualitative 
character; it is based on the capacity 
to form a wider range of temporary 
connections and on a higher degree of 
development of the integrative func- 
tion of the brain. These capacities, 
which arose in our ancestors as far 
back as the prehistoric period, appar- 
ently, constituted the physiological 
precondition for the development of 
abstract thought. 


CONCERNING THE MECHANISMS OF 
CHAIN MOTOR REACTIONS 


Two kinds of chain reactions
studied at our laboratory are
experimental models of those phenomena
which are called "motor habits" in
everyday life. Numerous
physiological and psychological researches have
been devoted to the study of these
habits, and yet up to now no common
view of their mechanisms has been
elaborated. Some researchers refer to
the reflex theory of Sechenov and
Pavlov and believe that the function
of voluntary movements is subject to
the laws of conditioned reflex and
that any elementary or complex habit
is a particular case of this reflex. Other
researchers positively deny the
possibility of explaining these phenomena of
motor activity within the
bounds of the principle of temporary
connections. Owing to this, they
regard habits as phenomena
qualitatively different from motor
conditioned reflexes.

We adhere to the first point of
view, and therefore are not inclined
to oppose the psychological notion of
motor habit to the physiological
notion of motor conditioned reflex.
Essentially these two notions are
identical, denoting one and the same
phenomenon, but from different
aspects. The physiological aspect
concerns mainly the nervous
mechanisms of this phenomenon, while the
psychological aspect emphasizes its
integrity and its general significance
for the organism.

It is quite obvious that both of the
above-mentioned cases of chain motor
reactions result from the consecutive
combination of motor acts into a 
single chain of movements, i.e., from 
the analytico-synthetical process. It 
is likewise obvious that the formation 
of a motor reflex to an external stimu- 
lus in an animal is a phenomenon of 
analysis of exteroceptive and proprio- 
ceptive stimuli and of their synthesis 
into a single functional "combination 
center." 

In man this functional combina- 
tion is of a more complex character. 
It includes also the functional com- 
bination center which is formed in the 
area of the second signaling system. 
At the same time these "centers" are 
created in both signaling systems ac- 
cording to the same principles. 
In one case the stimulus is an external,
directly acting agent; in the other
it is a word heard or seen, pronounced
mentally or aloud, a word which
designates the given agent.

In the first signaling system there 
takes place a combination of excita- 
tions coming from an external agent 
and from movements of the skeletal 
muscles, while in the second signaling 
system it is the excitations coming 
from speech and from the movements 
of the speech-motor apparatus that 
become fused. 

As a result of the synthesis of
separate motor acts into a chain of
movements (each of these acts arises in
response to a definite stimulus) and
after their training, this chain reaction
is evoked by the action of the first
stimulus alone. Such a motor set or
automatized chain of movements is,
as it were, independent of all other
chain stimuli; in man it is also
independent of the second signaling
system.

This phenomenon is, apparently, 
accounted for by the fact that as a 
result of constant training, excitation 
concentrates in corresponding nervous
pathways according to the principle
of negative induction and inhibits
the exteroceptive and second
signaling excitation (Figure 2). At the
same time the action of the first stim- 
ulus and its reflection in the second 
signaling system acquire a “starting” 
faculty and these are not subjected to 
inhibition. 

Inhibition, thus arising in the
structure of the motor reflex,
disappears at once if there emerges an
orienting reflex evoked, for example,
by a violation of the fixed application
of the stimuli. This explains why in
the presence of unusual signals on the
control panel a human subject stops
his stereotyped movements in order
to find out the cause of such a
violation in the sequence of signals.

When a chain motor set is formed
by consecutive addition of new links
of a chain of stimuli and movements
to links already established, the
significance of the external stimuli is
more pronounced, as is almost
constantly observed. This is apparently
explained by the fact that in this case
the external stimuli play a double
role: they reinforce the movement
already effected, at the same time
signaling the subsequent movement. As
to stimulations of the second
signaling system, they become more or less
inhibited with the training of the
chain of movements; when
performing his movements, the subject is
guided predominantly by
exteroceptive and kinesthetic stimulations.
These two kinds of stimulations may
not be reflected in the second
signaling system until their usual sequence
is changed and an orienting reflex
emerges (Ivanova, 1960). This
influence of the orienting reflex testifies
to the fact that the exteroceptive and
kinesthetic stimuli were reflected in
the second signaling system at the
beginning of the formation of a chain
of motor reflexes; but later, with the
training of the reflexes, they became
inhibited.


Fig. 2. Scheme illustrating an automatized chain motor reflex. (A, B, C = combination "centers"
of the links of a chain of conditioned reflexes; P = reinforcement of the chain of movements.
1. Combination centers of "first-signaling" system excitation: a = exteroceptors, a1 = cortical
focus of exteroceptive excitation; b = proprioceptors, b1 = cortical focus of proprioceptive
excitation; c = verbal stimulus, c1 = cortical focus of verbal stimulation; d = proprioceptors of
speech organs, d1 = cortical focus of kinesthetic stimulations arising in speech organs.
2. Combination centers of "second-signaling" system excitation.)


THE CONCEPT OF REINFORCEMENT


In connection with the question of 
the mechanisms controlling the for- 
mation of chain motor reflexes it is 
important to have a clear idea of 
what we imply by the concept
"reinforcement" and what its
physiological mechanism is.

In our experiments with animals 
food served as a reinforcement, while 
in experiments with human subjects 
this role was played by a signal which 
confirmed the correct solution of the 
task. In both cases, in the course of 
formation of a chain of reflexes each 
preceding movement was reinforced 
by the stimuli of the subsequent 
movement. According to the point of 
view adhered to in Pavlov's labora- 
tories, the coupling of a temporary 
connection results from the blazing of
a nervous pathway in the place where
the irradiating excitations from the
conditioned and unconditioned
cortical centers meet. This meeting of the
irradiating excitations is possible if
both centers are activated
simultaneously. Then the conditioned
excitation can reach the cortical
representation of the unconditioned center only
when a focus of excitation arises in it
which, according to this point of
view, plays a reinforcing role.

It may be assumed that any kind
of excitation (conditioned,
unconditioned, orienting, alimentary,
defensive, etc.) has a definite
physicochemical expression. This
assumption agrees with Pavlov's point of
view, advanced by him in his lectures
as far back as 1912, and also in a
number of his subsequent articles,
according to which the quality or
property of the stimulus determines the
character of the nervous process or,
in Pavlov's terminology, the "specific
course of stimulation."


Although at present we do not 
possess any factual data concerning 
the physicochemical foundations of 
the innermost mechanisms involved 
in the coupling of temporary connec- 
tions, it does not mean that we are 
unable to form certain ideas of these 
mechanisms by noting the results of 
experimental conditions on the or- 
ganism and its responses to them. 
We may, for example, assume that 
the brain models the external influ- 
ence in conformity with the various 
parameters of this influence, such as 
its strength, duration, frequency, etc. 
(Voronin & Sokolov, 1960; Sokolov, 
1960). The nervous model is created 
by the group of neurons which store 
information concerning the proper- 
ties of the stimulus. If, for example, 
the orienting reflex has been extin- 
guished, but one of the parameters of 
its stimulus has changed, then the 
nervous model, formed as a result of 
the excitation of the reflex, does not 
coincide with the new supply of
information. Owing to this
noncoincidence, there arise impulses of
"discoordination" spreading over the
descending cortico-reticular pathways
evoking an orienting reflex which 
“washes off" the previously formed 
nervous model. Subsequent events 
develop according to the law of the 
orienting reflex; in case of its extinc- 
tion (as a rule, this extinction is in- 
evitable) a new nervous model is 
formed and it exists as long as the 
conditions of its emergence persist. 
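
The cycle just described (a stored model, its noncoincidence with a changed stimulus, the orienting reflex, its extinction, and the formation of a new model) can be illustrated with a minimal sketch in Python. It is offered only as an illustration under stated assumptions and is not the authors' formulation: the names `Stimulus`, `NervousModel`, and `coincide`, the numerical tolerance, and the number of repetitions needed for extinction are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Parameters of the external influence named in the text.
PARAMETERS = ("strength", "duration", "frequency")


@dataclass(frozen=True)
class Stimulus:
    strength: float
    duration: float
    frequency: float


def coincide(a: Stimulus, b: Stimulus, tolerance: float) -> bool:
    """True if every parameter of a and b agrees within the tolerance."""
    return all(abs(getattr(a, p) - getattr(b, p)) <= tolerance for p in PARAMETERS)


class NervousModel:
    """Hypothetical comparator storing the parameters of a familiar stimulus."""

    def __init__(self, repetitions_to_extinguish: int = 3, tolerance: float = 0.1):
        # Both values are illustrative assumptions, not experimental data.
        self.repetitions_to_extinguish = repetitions_to_extinguish
        self.tolerance = tolerance
        self.model: Optional[Stimulus] = None
        self._candidate: Optional[Stimulus] = None
        self._count = 0

    def present(self, stimulus: Stimulus) -> bool:
        """Present a stimulus; return True if an orienting reflex is evoked."""
        if self.model is not None and coincide(stimulus, self.model, self.tolerance):
            return False  # information coincides with the model: no reflex
        # Noncoincidence: the orienting reflex "washes off" the old model.
        self.model = None
        if self._candidate is not None and coincide(
            stimulus, self._candidate, self.tolerance
        ):
            self._count += 1
        else:
            self._candidate = stimulus
            self._count = 1
        # After enough repetitions the reflex is extinguished and a new
        # model of the stimulus is formed.
        if self._count >= self.repetitions_to_extinguish:
            self.model = stimulus
            self._candidate = None
            self._count = 0
        return True
```

With the default settings of this sketch, a repeated stimulus evokes the orienting reflex on its first three presentations and not on the fourth; a change in any one parameter beyond the tolerance evokes the reflex again and removes the stored model.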

The mechanism of formation and 
reinforcement of a conditioned reflex 
may be presented in the following 
way: when two excitations of differ- 
ent codes coincide in time, one of 
them, which is the stronger, reshapes
the physicochemical structure of the 
other, and this leads to the creation of 
a single system of excitation. This 
system persists as long as the excita- 
tion coming along the afferent path- 
ways can flow coordinatedly into the 
stream of nervous impulses which are
of a related nature. If this does not 
occur, the nervous process becomes 
inhibitory. 

The process of automation of chain 
motor reactions takes place as a result
of constant training of the nervous 
connections in the area of the kines- 
thetic analyzer; this ensures an un- 
impeded irradiation of the nervous 
process in them, irrespective of the 
nervous connections of other analyz- 
ers entering the structure of the given 
system of a complex reflex. For some 
reason this predominance of kines- 
thetic stimulation evokes a state of 
slight inhibition in the area of the 
second-signaling system connections, 
as well as connections between the 
movements and direct, first-signaling 
system stimulations; at the slightest 
violation of the system of stimuli this 
inhibition is washed off by the orient- 
ing reflex that arises. Hence, two 
states of the nervous processes—exci- 
tation and inhibition—are created in 
each of the "functional combination 
centers" which are presented in our 
scheme (Figure 2). It is the
equilibrium of these two states which
underlies the development of an 
automatized chain motor reflex. 


GENERAL CONCLUSIONS 

Our comparative-physiological
investigations of higher nervous
activity now permit us to draw the
following two general conclusions. First,
the peculiarities of higher nervous 
activity in vertebrates of different 
phylogenetical levels are based on the 
quantitative growth and complication 
of the conditioned reflex mechanisms; 
but the principle of the organism's 
interaction with the environment is 
identical in all cases. Individual ex- 
perience and individual training of 
both the lower and the higher verte- 
brates are due to the establishment of 
connections between the concrete in- 
fluences of the environment and the 
various activities of the organism.
The number of these connections in- 
creases with the development of the 
nervous system, their structure as- 
suming a more and more complex 
nature. Considering the same process 
from a psychological point of view, 
we may say that concrete or object 
thinking of animals becomes enriched 
with the development of their nerv- 
ous system. 

Even in the case of monkeys whose 
higher nervous activity is so greatly 
developed, there are no grounds for 
assuming the presence of any human- 
like mentality in them, or of any rudi- 
ments of such mentality. It may 
rather be said that human mental 
activity has preserved certain
"remainders" of the animal intellect in
the shape of concrete, image-bearing,
object thought, but this is usually
dominated by abstract, verbal
thought.

Second, the basic mechanism of 
higher nervous activity, which devel- 
oped in the process of evolution of the 
animal world, is a mechanism of 
analysis and synthesis of stimulation 
falling on the organism, as well as of 
the activity arising in response to it. 

On the basis of the factual material 
obtained by myself and by other re- 
searchers I am able to mark out 
three stages in the evolutionary de- 
velopment of higher nervous activity 
of animals (Voronin, 1960a). The 
distinctive features of each of these 
stages are: the state of perfection of
the analytico-synthetical mechanism 
of higher nervous activity and corre- 
lation between its acquired and in- 
born components. 

The first stage includes the higher 
nervous activity of animals possess- 
ing a primitive and coarsely different- 
iated nervous system. The behavior 
of these animals, just as that of 
present-day coelenterata, worms, and 
arthropoda, was determined
predominantly by inborn reactions. At this
stage the role of reactions acquired
during the individual life of the
organism was insignificant. The general
principle underlying the work of
physiological mechanisms concerned
with acquired activity does not, ap- 
parently, differ from the principle of 
temporary connections which are 
formed in modern animals. A low 
degree of development of the prop- 
erties of the nervous processes and 
the capacity to retain nervous traces 
made possible formation only of ele- 
mentary conditioned reflexes of a gen- 
eral nature such as locomotion in re- 
sponse to coarse signals which indi- 
cate circumstances favorable or un- 
favorable to the organism. This 
phase of development of higher nerv- 
ous activity may be designated as a 
phase of simplest analysis and 
synthesis of elements of the organ- 
ism’s environment. 

At the second stage of its develop- 
ment higher nervous activity was, ap- 
parently, of the same character as 
that in modern fish, amphibians, and 
reptiles. At this stage the analytico- 
synthetical mechanisms control more 
delicate individual relations between 
the organism and its environment, 
although the specific, hereditary 
mechanisms of these relations still 
play a predominant role. However, 
in this group of animals the range of 
temporary connections is greater, and 
this considerably increases the role of
individual adaptive reactions. With
the development of the locomotor
organs and their central
coordination, the local motor reactions
occupy a more important place in
the system of conditioned reflex
activity. The development of the
equilibrium and, especially,
lability of the excitatory and
inhibitory processes enables the organism
to change its reactions within a
comparatively short space of time in
conformity with the changes that arise in
the environment.


The third stage relates to that kind 
of higher nervous activity which is in- 
herent in modern birds and mam- 
mals. A high degree of development 
of this activity in the ancestors of 
modern anthropoids and man played 
a particularly important role at this 
stage. These animals were distin- 
guished from the rest of the animal 
world by their highly developed 
analytico-synthetical mechanisms, 
just as higher monkeys are distin- 
guished by it at the present time. 

At the third stage of evolutionary 
development of the higher nervous 
activity an ever-increasing role in the 
behavior of the organism is played by 
reactions acquired in the course of in- 
dividual life. 

At this stage not only hereditary- 
unconditioned reactions, but also re- 
actions acquired in individual life 
cause the formation of temporary 
connections. Complexes of condi- 
tioned-unconditioned reflexes may 
originate complex multistage tem- 
porary connections. These nervous 
connections are utilized not only in 
the conditions in which they were
formed, but also in new, more or less
similar conditions; in other words,
the phenomenon of so-called
"transfer of experience" takes place.

Thus, the third stage of develop- 
ment of higher nervous activity in 
animals is characterized by a high 
level of analytico-synthetical mecha- 
nisms, by the faculty of the neuronic 
apparatus to "impress" stimulations, 
as well as by a protracted retention of 
nervous traces, lability of the
neurons, specialization of the analyzers
and at the same time their
integration. All this ensured the finest
individual adaptation of the organism
(by means of temporary connections) 
to changing environmental condi- 
tions. 

The above mentioned peculiarities 
of the third stage of development of 
higher nervous activity are more or 
less manifest also at the second and
even first stages. It may be said that 
the third stage includes the first and 
second ones, but differs from them, 
above all, quantitatively. 

Comparing, for example, the re- 
sults of our investigations on anthro- 
poids and other vertebrates we can by 
no means state that there exists a 
specific qualitative difference be- 
tween them. The point is that one 
and the same principle of individual 
adaptive reactions is of different sig- 
nificance on various levels of phylo- 
genesis. 

At the third stage a high degree of 
development was reached by that 
aspect of higher nervous activity 
which can be considered as concrete 
or image bearing thought. 

At the level of man the qualita- 
tively new signaling mechanisms, 
which were designated by Pavlov as 
the second signaling system of reality, 
gave rise to a new type of thought, 
namely, abstract thought. This type 
is indissolubly bound up with con- 
crete, image bearing thought, on the 
basis of which it arises and over which 
it dominates; it is of primary impor- 
tance in man's relationships with the 
conditions of his existence. 
The aim of this article is to assess 
the extent to which the various theo- 
ries about the personality of epileptics 
have been affirmed or refuted, and to 
discuss some of the more important 
variables which should in the future 
be investigated by those working in 
this field. The complexity of the 
theoretical and methodological prob- 
lems involved is also pointed out. 


THEORIES 


The theories discussed here are for 
the most part generalizations of par- 
ticular clinicians’ experiences, with 
little attempt to relate them to a 
wider body of knowledge. Five basic 
theories may be distinguished as 
follows: 

1. That all or most epileptics share 
a characteristic personality. This 
was the original theory advanced in 
the nineteenth century by Falret and 
Féré, and extensively held on the 
Continent. It still has advocates, 
especially in Germany. The person- 
ality traits considered characteristic 
of epileptics have been variously de- 
scribed by different authors. There is 
some agreement on a basic syndrome 
of perseveration and viscosity in both 
the intellectual and affective spheres. 
Emotional explosiveness has also 
been named as a central trait by 
most authors, either co-existing with 
viscosity (Minkowska, 1946) or re- 
placing it (Clark, 1918; Stauder, 
1938). Traits such as suspiciousness, 
religiosity, meticulousness, and sel- 
fishness have been emphasized by 
others. According to these writers, 
the epileptic personality and the pre- 
disposition to convulsions are con- 
stitutionally determined. American 
writers who have recognized a char- 
acteristic if not universal epileptic 
personality have usually attributed it 
to the frustrating environment and 
the social stigma to which the epilep- 
tic is exposed, or to the effect of brain 
dysfunction on the personality, or to 
both these factors (Notkin, 1928; 
Revitch, 1955). 

In receipt of a grant from the Medical 
Research Council. 

I am indebted to D. A. Pond and J. H. 
Margerison for many helpful discussions. 

2. That there is no characteristic 
epileptic personality, and the same 
range and combination of personality 
traits may be found among epileptics 
and nonepileptics (Lennox, 1944). 
This theory was widely held in Amer- 
ica during the "thirties" and "for- 
ties," particularly by clinicians whose 
main experience was with private 
patients. 

3. That there is no characteristic 
epileptic personality or personality 
disturbance, but a higher proportion 
of neurotic disturbance is found 
among epileptics than among non- 
epileptics (Bridge, 1949). 

4. That there is no characteristic 
epileptic personality, but epileptics 
tend to have a personality resembling 
that of patients with organic lesions, 
which differs from that of normal 
persons. This theory has been par- 
ticularly favored by clinicians work- 
ing with epileptic children (Bradley, 
1951). 

Each of these theories has been 
advanced largely on the basis of 
clinical observation, but in each case 
has been supported by evidence from 
more objective studies, which will be 
considered in the sections on Ror- 
schach Studies and Studies Other 
than Rorschach Studies. 

5. That there is no characteristic 
personality common to all or most 
epileptics, but different types of per- 
sonality are associated with different 
types of epilepsy. This is not a new 
theory—as far back as 1938 German 
authors were describing two distinct 
personality types, associated with 
idiopathic and symptomatic epilepsy. 
Since about 1950, however, with the 
delineation of, and growing interest 
in temporal lobe epilepsy, it has 
assumed a new importance. Tem- 
poral lobe epilepsy is now generally 
held to be associated with personality 
disturbance, and many of the un- 
pleasant traits formerly attributed to 
epileptics in general are now said to 
characterize temporal lobe epileptics. 
In some neurosurgical centers tem- 
poral lobectomy is performed pri- 
marily in order to alleviate the person- 
ality disturbance of these epileptics. 
Some writers also describe a charac- 
teristic personality pattern found in 
children with petit mal epilepsy, and 
Gastaut and his colleagues have 
recently delineated an idiopathic 
epileptic personality type. 

The theory that differences in per- 
sonality among epileptics are related 
to the type of epilepsy has, in the 
case of temporal lobe epilepsy, 
been related to the results of 
experimental work with animals and 
the results of surgery. Its advo- 
cates point out that bilateral ablation 
of the rhinencephalon in monkeys 
produces increased activity and docil- 
ity (Klüver & Bucy, 1939); hence, the 
[illegible] and aggressive out- 
bursts of the patient with temporal 
lobe epilepsy may be considered to 
represent a state of excitation of 
the rhinencephalon. Moreover, some 
surgical centers report a specific 
decrease in aggressive behavior after 
temporal lobectomy for epilepsy (Ala- 
jouanine, Nehlil, & Houdart, 1958; 
James, 1960). This theory is thus 
more than a clinical generalization, 
and has a rather different status from 
those outlined above. For this reason, 
as well as for its therapeutic implica- 
tions, and because it has received 
little attention in psychological jour- 
nals, the clinical evidence on which it 
is based will be briefly described. The 
evidence from psychological tests will 
be discussed in the sections on Ror- 
schach Studies and Studies Other than 
Rorschach Studies. 

The clinical evidence includes in- 
cidental observations, systematic 
studies of series of patients, and ob- 
servations of the large proportion of 
patients with temporal lobe epilepsy 
amongst epileptics in mental hos- 
pitals. Typical of the incidental ob- 
servations is Robertiello's (1953) 
description of the child with psycho- 
motor epilepsy as "usually coopera- 
tive, good, quiet and overcontrolled, 
but with episodes of impulsive and 
often violent and destructive be- 
havior." Peterman (1953) describes 
him as a child with "an abnormal 
personality, with recurring episodes 
of behaviour disorder,” in contrast to 
the child with petit mal epilepsy, who 
is usually "mentally precocious, alert, 
sensitive and temperamental.” Pond's 
contrast is between the aggressive 
child with temporal lobe epilepsy, 
and the timid passive child with petit 
mal epilepsy (Pond, 1952, 1961). 

More systematic studies have re- 
ported the incidence of personality 
disturbance among temporal lobe 
epileptics. Hill (1957) reported that 
50% suffer from personality disorders, 
and of these 25% have psychotic 
episodes. Gibbs (1958) stated that 
about 40% of the patients with psy- 
chomotor epilepsy have severe per- 
sonality disorders, and of these about 
one third are classifiable as psychotic. 
Gastaut and Gastaut (1951, unpub- 
lished) found that 52% of psycho- 
motor epileptics attending outpatient 
clinics have psychiatric disorders and 
Bingley (1958) identified personality 
changes in 52% of epileptics with 
temporal lobe foci. Vislie and Henrik- 
sen (1958), reviewing the psychiatric 
symptoms of 162 epileptics, excluding 
psychotics, found a tendency for a 
higher incidence of neurotic symp- 
toms when the localization signs 
pointed to the temporal region. The 
only comparable study with children, 
separating personality disturbance 
from intellectual defect, is that of 
Glaser and Dixon (1956). They found 
that 19 out of 25 children with psy- 
chomotor seizures had interictal per- 
sonality disturbance. 

Two recent studies have reported a 
large proportion of patients with 
temporal lobe epilepsy among epi- 
leptics in mental hospitals. Liddell 
(1953) reported an incidence of 50% 
and Roger and Dongier (1950) 64%, 
as compared with 30% in a series of 
outpatient epileptics (Gastaut, 1950). 

The nature of the personality dis- 
turbance has been variously de- 
scribed. Gibbs (1958) found no spe- 
cific symptoms, but Mulder and 
Daly (1952) stressed the frequency of 
anxiety and depression. Falconer, 
Hill, Meyer, and Wilson (1958) found 
explosive or persistent aggressiveness 
the commonest symptoms, as did 
Liddell (1953) and Roger and Dongier 
(1950) in their surveys of epileptics 
with temporal foci in mental hos- 
pitals. Paillas (1958) particularly 
noted the frequency of slowness, 
adhesiveness, and perseveration as 
well as aggressiveness, and Bingley 
(1958) found the commonest syn- 
drome was "adhesiveness in the in- 
tellectual, emotional and volitional 
spheres." Vislie and Henriksen (1958), 
however, found these traits were asso- 
ciated with evidence of diffuse lesions 
and organic dementia rather than 
temporal localization. 

There are two main difficulties in 
evaluating these studies. In the first 
place, there appear to be important 
differences between the epileptic 
populations studied. In some studies, 
mainly patients with pronounced 
psychiatric symptoms were included 
(Falconer et al., 1958; Liddell, 1953; 
Mulder & Daly, 1952; Roger & 
Dongier, 1950). In some studies the 
criterion for selection was purely 
clinical (Glaser & Dixon, 1956); in 
others, electrophysiological (Bingley, 
1958; Vislie & Henriksen, 1958); 
whilst in still others a combination of 
clinical and electrophysiological cri- 
teria was used (Mulder & Daly, 1952; 
Gastaut et al., 1955). Since electro- 
clinical correlations are far from per- 
fect, different populations may have 
been sampled. This point is discussed 
further in Section 4. 

Secondly, no study shows an ade- 
quate appreciation of the problems of 
bias and reliability in the judgments 
made. Precautions are not reported 
to prevent contamination of EEG 
interpretation by clinical data and 
the reliability of the EEG and clinical 
diagnoses is not assessed. Bingley's 
is the only study which describes 
reassessment of the EEG records by 
another judge, without knowledge of 
the previous interpretation of the 
patient's clinical symptoms. The 
position is worse in respect of the 
assessment of the presence or type of 
psychiatric disturbance. This is left 
undefined in all studies and is never 
made without knowledge of clinical 
status. Nor is the reliability of the 
judgments assessed. 
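
One way in which the reliability of such judgments could be assessed is to have two judges rate the same patients independently and to compute a chance-corrected index of their agreement. The following is a minimal sketch only, assuming Cohen's kappa as the index; none of the studies reviewed here reports using it, and the ratings, the judge labels, and the function name `cohens_kappa` are hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two judges rating the same cases."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both judges must rate the same non-empty set of cases")
    n = len(ratings_a)
    # Proportion of cases on which the two judges give the same rating.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each judge's marginal frequencies.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is perfect")
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of ten patients by two independent judges.
judge_1 = ["disturbed", "disturbed", "normal", "normal", "disturbed",
           "normal", "normal", "disturbed", "normal", "normal"]
judge_2 = ["disturbed", "normal", "normal", "normal", "disturbed",
           "normal", "disturbed", "disturbed", "normal", "normal"]

kappa = cohens_kappa(judge_1, judge_2)
print(f"kappa = {kappa:.2f}")
```

In these invented data the judges agree on 8 of 10 cases, but because over half of that agreement would be expected by chance, the corrected index is considerably lower than the raw 80%.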

One study of greater methodological 
sophistication than the others is that 
of Nuffield (1961). The EEG records 
of 233 epileptic children who had 
attended the Maudsley Hospital dur- 
ing the previous 10 years were classi- 
fied into 7 electrophysiological groups. 
Each child was rated for aggressive- 
ness and neurotic manifestations on 
the basis of the answers which had 
been recorded on the standardized 
psychiatric case history to such ques- 
tions as "Is he irritable, a bully, 
timid, sensitive?" The mean "aggres- 
sive" score derived from these ratings 
of the children with a temporal lobe 
EEG focus was higher than that of 
any other EEG group, while their 
mean "neurotic" score was the lowest. 
Conversely, the mean aggressive 
score of the 3/sec SW group was the 
lowest, and their mean neurotic score 
was the highest. The correlations 
between fit patterns and behavioral 
ratings were much lower than those 
between EEG classifications and be- 
havioral ratings. In this study con- 
tamination between EEG data and 
behavioral ratings was avoided, but 
the reliability of the judgments was 
not assessed. 


RORSCHACH STUDIES 


A great majority of psychologists 
who have studied the personality of 
epileptics have used the Rorschach 
test, and the evidence from these 
studies will be considered first. A 
specific epileptic Rorschach protocol 
common to all or many epileptics has 
been described by some authors 
(Minkowska, 1946; Rorschach, 1942; 
Stauder, 1938). More cautiously, 
others have concluded that while 
there is no specific epileptic personal- 
ity, epileptics share many traits and 
can be identified by the use of the 
Rorschach (Altable, 1947; Bovet, 
[illegible] & Marquilies, 1940; 
Piotrowski, 1947; Zehrer, 1951). 
Other studies, however, have found 
no statistically significant difference 
between the Rorschach scores and 
patterns of epileptic and nonepileptic 
groups (Kogan, 1947; Lisansky, 1948; 
Shaw & Cruikshank, 1957). The 
theory that the personality of epilep- 
tics resembles that of brain injured 
patients is supported by a number of 
Rorschach studies (Piotrowski, 1947; 
Ross, 1941; Zimmerman, Burge- 
meister, & Putnam, 1951). Other 
studies, however, report evidence of 
neurosis but no signs characteristic of 
brain injury (Arluck, 1941; Kaye, 
1951; Richards, 1952; Zehrer, 1951). 
The belief that temporal lobe epi- 
leptics share a characteristic per- 
sonality is supported by three Ror- 
schach studies (Delay, Pichot, 
Lemperiere, & Perse, 1955; Gastaut, 
Morin, & Lesevre, 1955; Paillas & 
Subirana, 1950). The Rorschach 
pattern identified as characteristic is, 
however, quite different in each of 
these studies. 

There is thus evidence from Ror- 
schach studies for and against each 
theory about the personality of epi- 
leptics which has been advanced. In 
attempting to account for the con- 
tradictory nature of these findings, 
one is struck by two rather crude 
methodological errors that invalidate 
the great majority of the studies. 

1. Many of the earlier studies used 
institutionalized epileptics, a very 
small and uncharacteristic sample of 
the epileptic population. These pa- 
tients are mentally disturbed and 
may also be of low intelligence. All 
the descriptions of a specific epileptic 
Rorschach protocol derive from stud- 
ies of such patients. 

2. The great majority of studies 
have not controlled for IQ. They 
have compared the Rorschach proto- 
cols of epileptics with the Rorschach 
norms, assuming either that intel- 
lectual level was not an important 
factor, or that the epileptic group and 
the Rorschach normal group were of 
similar intelligence. However, evi- 
dence has recently accumulated which 
shows that intellectual level affects a 
great range of Rorschach responses 
in all-pervasive ways. Neff and 
Lidz (1951) in a study of army per- 
sonnel divided into three subgroups 
according to intelligence showed that 
only the group with superior intelli- 
gence gave Rorschach responses usu- 
ally considered typical of normal 
persons. Men with average and below 
average intelligence gave responses 
that by the usual Rorschach norms 
would be considered indicative of 
emotional disturbance. Similar find- 
ings are reported by Wedemeyer 
(1954). Later, Neff and Glaser (1954) 
showed that in Beck's normal group, 
used to furnish control data for many 
Rorschach studies, the percentage of 
those with high school education was 
two and a half times greater than in 
the general population. The epi- 
leptic protocols have thus been as- 
sessed and found in various ways 
abnormal by comparison with those 
of a group of above average intelli- 
gence. 

The same kind of evidence invali- 
dates most of the Rorschach studies 
of children. Control groups have 
rarely been used, and the epileptic 
protocols have been compared with 
published Rorschach child norms, 
such as the Ames norms. Fielder and 
Stone (1956) have shown, however, 
that three quarters of the children in 
the Ames Rorschach normal group 
came from professional and manage- 
rial classes. The responses from their 
own group of predominantly lower 
class normal children departed very 
considerably from the Ames’ criteria 
of normality. 

It thus appears that the conclusions 
of the great majority of Rorschach 
studies are unacceptable. In fact, the 
Rorschach scores of the epileptic 
groups resemble, and compare rather 
favorably with, those reported by 
Neff and Lidz and Wedemeyer for 
normal service groups of equivalent 
IQ. Even if, however, one considers 
only those studies which compare epi- 
leptics living in the community with 
nonepileptics, matched for age and 
IQ, the results appear no less con- 
flicting than before. 

There are only five such studies 
published. In addition, two studies 
have compared the Rorschach proto- 
cols of temporal lobe epileptics with 
other epileptics. Lisansky (1948) 
compared the Rorschach protocols of 
10 adult noninstitutionalized epi- 
leptics and 10 diabetics, matched for 
age, education, and duration of ill- 
ness. Both groups were of average 
intelligence. She found that the only 
significant difference between them 
was the slower response time of the 
epileptic group. The Rorschach 
"epileptic signs" did not differentiate 
the two groups. 

Piotrowski (1947) compared the 
records of 25 epileptic adults, "none 
psychotic, hospitalized, or conspicu- 
ously deteriorated" and 25 hysterics 
of similar age and IQ. He found 14 
“epileptic signs.” Six of these had 
previously been included in his scale 
for differentiating patients with or- 
ganic lesions. Most of the other signs 
can be found in various scales for 
differentiating neurotic or other dis- 
turbed patients. He also found that 
the presence of seven or more of these 
signs in a protocol identified 80% of 
his epileptic group, while none of the 
hysterics’ records had more than four 
signs. 
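
The cutoff rule described above can be sketched as follows. This is an illustration only: the sign counts are hypothetical, chosen so that 80% of the epileptic group reach the cutoff of seven and no hysteric record exceeds four signs; they are not Piotrowski's data, and the function name `proportion_at_or_above` is an assumption.

```python
def proportion_at_or_above(sign_counts, cutoff):
    """Return the fraction of protocols whose sign count reaches the cutoff."""
    if not sign_counts:
        raise ValueError("sign_counts must not be empty")
    return sum(1 for n in sign_counts if n >= cutoff) / len(sign_counts)


# Hypothetical number of "epileptic signs" per protocol (invented for illustration).
epileptic_counts = [9, 8, 7, 7, 10, 8, 7, 9, 5, 6]
hysteric_counts = [0, 1, 2, 4, 3, 2, 1, 4, 0, 3]

CUTOFF = 7  # seven or more signs, as in the rule described in the text

hit_rate = proportion_at_or_above(epileptic_counts, CUTOFF)
false_positive_rate = proportion_at_or_above(hysteric_counts, CUTOFF)

print(f"epileptic protocols reaching the cutoff: {hit_rate:.0%}")
print(f"hysteric protocols reaching the cutoff: {false_positive_rate:.0%}")
```

A rule tuned in this way on the sample from which the signs were derived would need to be checked on a fresh sample before its discriminating power could be trusted.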

Arluck (1941) compared the Ror- 
schach records of 16 idiopathic epi- 
leptics without an EEG focus, aged 
10-21, with control groups of their 
sibs, of cardiac patients, and of the 
sibs of cardiac patients. The groups 
were equated for sex, age, IQ, and 
socioeconomic status. He found sev- 
eral statistically significant differences 
in the Rorschach scores. The epi- 
leptic group showed more signs of 
color shock, had fewer Ws in their 
records, and their mean time for re- 
sponse was greater. All three of these 
signs are included among Piotrowski's 
epileptic signs, and the first two occur 
in most Rorschach scales that claim 
to differentiate normals from any 
other groups. 

Shaw and Cruikshank (1957), how- 
ever, obtained negative results when 
they compared the Rorschach proto- 
cols of 25 institutionalized idiopathic 
epileptic children and 25 institution- 
alized nonepileptic children matched 
for age, sex, and IQ. The only sig- 
nificant differences found were in FV 
and number of different types of con- 
tent, both being larger for the epi- 
leptic group. 

Kogan (1947) compared the Ror- 
schach protocols of 10 idiopathic epi- 
leptic children with behavior prob- 
lems and 10 nonepileptic children 
attending the same child guidance 
clinic, matched for age, IQ, and 
severity of emotional disturbance. 
She found no statistically significant 
differences between the two groups 
on any Rorschach variable. Clini- 
cally, both groups contained shy, 
quiet, anxious children and aggres- 
sive, antisocial children. 

Delay et al. (1955) studied the Ror- 
schach protocols of 50 epileptics of 
average intelligence classified accord- 
ing to presumed etiology and also site 
of EEG focus, if any. They found 
that 48% of the whole group gave 
[illegible] or more of Piotrowski's epi- 
leptic signs and 68% gave five or 
more of his organic signs. Patients 
with temporal epilepsy gave 
significantly more extratensive 
Rorschach protocols than other 
epileptics. Gastaut et al. (1955) com- 
pared the Rorschach protocols of 
noninstitutionalized psychomotor 
epileptics of average intelligence and 
a group of idiopathic epileptics. Only 
general impressions of the results were 
given. They reported that 72% of 
the psychomotor epileptics showed a 
syndrome of hypoactivity and emo- 
tional indifference or depression. It 
was the idiopathic group whose proto- 
cols were predominantly extratensive. 

There is thus on the one hand 
Piotrowski’s evidence that the Ror- 
schach protocols of epileptics re- 
semble each other but differ from 
those of hysterics by the presence of 
certain characteristic signs. This is 
partially supported by Arluck’s study, 
and by Delay’s finding that the Ror- 
schach protocols of 48% of an epi- 
leptic group of average intelligence 
contain a large number of these signs. 
On the other hand three further stud- 
ies have been unable to differentiate 
the Rorschach protocols of epileptics 
and normal children, disturbed epi- 
leptic children and disturbed non- 
epileptic children, and epileptic adults 
and diabetic adults. Two studies of 
the Rorschach protocols of patients of 
average intelligence with temporal 
lobe epilepsy reached opposite con- 
clusions. 

The extent to which the clinical 
status of the patients in these studies 
is comparable will be discussed below. 
The question must be raised, how- 
ever, whether there are any inade- 
quacies in the Rorschach test which 
make it an unsuitable tool with which 
to investigate the personality of 
epileptics. The primary concern of 
Piotrowski (1947) and many workers 
in this field was differential diagnosis. 
Yet the Rorschach test has repeatedly 
been shown to be an unsatisfactory 
diagnostic instrument. When the 
items originally found to differentiate 
two groups are put into the form of a 
scale and a cross-validation study is 
made by a different worker on a fresh 
sample, the differentiating power of 
the scale invariably drops consider- 
ably (Yates, 1954). Subjective scor- 
ing systems and test unreliability no 
doubt contribute to this finding. The 
consistency of the subject, subject 
reliability, scoring reliability, inter- 
pretation reliability, and enquiry 
reliability have proved disappoint- 
ingly low (Baughman, 1951; Camp- 
bell & Fiddleman, 1959; Fiske, 1959). 
Moreover, where real differences in 
personality are known to exist be- 
tween groups the Rorschach test can- 
not be relied upon to detect them. 
Studies in the last decade which have 
taken care to avoid contamination of 
Rorschach interpretations with clini- 
cal data have found slight or no sig- 
nificant differences between the Ror- 
schach scores of hospitalized schizo- 
phrenics and normals (Friedman, 
1952), between different psychiatric 
groups (Wittenborn & Holzberg, 
1951), or between neurotics and 
schizophrenics (Reiman, 1953). Too 
much confidence cannot therefore be 
placed in the Rorschach test as an 
instrument for determining whether 
epileptics as a group resemble or differ 
from neurotics, normals, or brain 
injured groups. 

Some studies have used the Ror- 
schach test as a tool not only to differ- 
entiate between epileptic and other 
groups, but to describe the character- 
istics of the epileptic personality. 
Thus Arluck (1941) deduced from the 
protocols of his group that they 
suffered emotional strain, had much 
conflict within their basic personality 
configuration, and tended to adjust 
by withdrawing from the external 
world. 

Such a use of the test depends on 
the assumption that different modes 
of response to the blots are deter- 
mined, in a manner that is known, by 
dominant and enduring personality 
traits. Thus an extratensive Ror- 
schach experience balance (i.e., a 
record with overemphasis on color) is 
said to be characteristic of an ego- 
centric emotionally explosive per- 
sonality. However, as Pruyser points 
out (1953), Rorschach, in common 
with most of his contemporaries, 
believed that epileptics are by con- 
stitution predisposed to these traits. 
When he found an unusually large 
number of color responses, especially 
CF and C, in the protocols of dete- 
riorated epileptics, he concluded that 
such responses were determined by 
emotional explosiveness. Alternative 
explanations were not considered or 
investigated. Recent experimental 
work, however, strongly suggests that 
there is no relationship between num- 
ber and type of color responses and 
emotional lability (Baughman, 1954; 
Keehn, 1954; Lazarus & Oldfield, 
1955). 

Attempts to validate the alleged 
relationship between Rorschach signs 
and such traits as constriction and 
impulsiveness have been no more 
successful (Carp, 1950; Holtzman, 
1950). Very few such studies have 
been made. There is no adequate 
evidence for most of the Rorschach 
sign-trait correlations, and inter- 
pretations such as Arluck's have to 
be accepted on faith. Hence, even if 
consistent findings had emerged from 
the Rorschach studies, their signifi- 
cance in terms of behavioral corre- 
lates would have been a matter for 
further experimental investigation. 


STUDIES OTHER THAN RORSCHACH 
STUDIES 


There have been very few studies 
by psychologists of the personality of 
epileptics which have not used the 
Rorschach test. Meyers and Brecher 
(1941), using the Kent-Rosanoff 
Word Association Test, found no sig- 
nificant differences between a group 
of idiopathic epileptics and normal 
persons, matched for age, sex, IQ, and 
socioeconomic status. Arluck (1941) 
found no significant differences be- 
tween the scores of his epileptic and 
control groups on a level of aspiration 
test and a personality questionnaire. 
These results are difficult to evaluate 
in the absence of evidence that the 
tests can differentiate significantly 
between any diagnostic groups what- 
soever. 

Davies-Eysenck (1950) assessed 38 
adults suffering from idiopathic epi- 
lepsy, using three tests of neuroti- 
cism, which had been shown to dis- 
criminate very significantly between 
normals and neurotics. Even though 
she selected only those patients who 
were not mental defectives, were 
capable of paid employment, and 
fairly regular in clinic attendance, 
she found the mean score of the group 
one standard deviation towards the 
neurotic end of the scale. There was 
no correlation between length of ill- 
ness and degree of neuroticism. 

Halstead (1957) compared 28 epi- 
leptic children attending ordinary 
schools, 12 epileptic children attend- 
ing a physically handicapped school, 
and 28 attending a special epileptic 
residential school, with 54 normal 
children. The investigation was pri- 
marily concerned with cognitive abil- 
ity and educational attainments, 
but behavior was assessed from in- 
formation supplied by parents and 
schools. Thirty-seven percent of the 
epileptic group were considered to 
have good behavior, 35% to have bad 
behavior (aggressive, destructive, 
[illegible]), and 28% to have negative be- 
havior (timid, oversensitive, 
[illegible]). The only variable analyzed 
which showed a significant correla- 
tion with bad behavior was attend- 
ance at the special epileptic school. 
This was, of course, one of the main 
reasons for referral. Positive but not 
significant correlations with bad be- 
havior included brain injury, symp- 
tomatic epilepsy, frequent seizures, 
and longer duration of epilepsy. 
There were no statistically significant 
correlations between negative or good 
behavior and other variables, al- 
though correlations between good 
behavior, normal milestones, short 
duration of epilepsy, having major 
seizures, and attendance at normal 
school were all positive. There were 
no known cases of brain injury among 
the group with good behavior. 

This study is of particular interest 
because of its representative sample 
of epileptic children. Unfortunately, 
the method of making the behavioral 
ratings is not recorded, and the reli- 
ability of the ratings was not assessed. 

Gastaut et al. (1955) compared the 
performance of 60 adult psychomotor 
epileptics of average intelligence and 
an unspecified group of idiopathic 
epileptics on a variety of psycho- 
logical tests. They described 72% of 
the psychomotor epileptics as slow 
and adhesive, with a flat depressed 
affect which was reflected in their 
TAT stories. The idiopathic epi- 
leptics on the other hand were found 
to be quick, hyperactive, and emo- 
tionally labile. The test results on 
which these conclusions were based 
were not reported, and the EEG and 
clinical criteria by which the epi- 
leptic groups were selected were not 
defined. 

Grunberg and Pond (1957) com- 
pared from case history records the 
family background of three groups of 
children who had attended the 
Maudsley Hospital: 53 epileptics 
with conduct disorders, 53 epileptics 
without conduct disorders, and 33 
nonepileptics with conduct disorders. 
They found very similar adverse 
factors (disturbed parental attitudes, 
marital disharmony, breaks and 
changes in the environment) were 
present in the families of both groups 
of children with conduct disorder, but 
absent from the background of well- 
adjusted epileptic children. The 
method by which the adverse factors 
were assessed is not described, and the 
reliability of the judgments was not 
assessed. This study confirms the 
earlier finding of Sullivan and Gahagen 
(1935) that epileptic children with 
serious personality or conduct dis- 
orders "had almost without excep- 
tion a poor home environment." 


DISCUSSION 


The work surveyed above gives no 
support to the theory that all or most 
epileptics share a characteristic per- 
sonality. It is, however, hardly ade- 
quate to affirm or refute the other 
theories that have been advanced. 

There is some evidence that the 
incidence of personality disturbance 
may be high among epileptics, or 
among some groups of epileptics, and 
that different types of personality 
disturbance are associated with differ- 
ent types of epilepsy. From Grun- 
berg and Pond's and Nuffield's stud- 
ies cited above one may infer that 
these personality differences result 
from a complex interaction of environ- 
mental and pathophysiological fac- 
tors. The contribution of heredity 
has not been studied, although 
Harvald (1954) has shown that 
psychosis, psychopathy, suicide, and 
criminality do not occur more fre- 
quently among the relatives of epi- 
leptics than in the general population. 

Little progress is likely to be made 
unless future studies define much 
more precisely than before the char- 
acteristics of the epileptic group on 
whom observations are made. Most 
of the studies reviewed above have 
described their patients only in terms 
of age, sex, and IQ, and classified 
their epilepsy either as idiopathic or 
symptomatic, or as involving major 
or minor seizures. Neither method of 
classification has proved illuminating, 
and both obscure factors that there is 
reason to believe may be important. 

Patients showing no signs of brain 
lesion are said to have idiopathic 
epilepsy, a condition believed to re- 
sult from an inherited instability of 
cerebral function (Lennox, Gibbs, & 
Gibbs, 1940), and differing markedly 
from symptomatic epilepsy, which is 
associated with demonstrable cere- 
bral lesions. However, there is evi- 
dence of a multiple etiology in all 
epilepsies. Heredity has been shown 
to be an important factor in sympto- 
matic epilepsy (Williams, 1950) while 
a high incidence of twin births and 
breech births has been found in the 
history of patients with petit mal 
seizures and 3 per second SW EEGs,
the classical form of idiopathic epi-
lepsy (Churchill, 1959). The diag-
nosis of idiopathic epilepsy depends
mainly on negative evidence—the 
lack of evidence of a focal lesion or of 
a history of brain injury. But as our 
understanding of epilepsy grows, 
more precise diagnoses can be made. 
Many patients, for example, who 
were classified as idiopathic epileptics 
in the studies reported above would 
now be diagnosed as cases of temporal 
lobe epilepsy, with presumed focal 
lesions. 

One might predict that classifica-
tion of patients into those having
major or minor seizures, or both, 
would have significance at least in 
terms of the different psychological 
effects of these types of seizure on the 
patient and on society. In fact, how-
ever, this prediction is not supported
by Halstead's (1957) study. He found 
a correlation, not significant, between
bad behavior and having major
seizures. Nor has this method of
classification neurophysiological sig-
nificance. Any type of minor seizure
may develop into a major one, and
both clinical forms may be associated
with focal, diffuse, or no known lesions
of cortical or subcortical origin.
Within the supposedly homogene-
ous groups of idiopathic or sympto-
matic epileptics, or epileptics suffer-
ing from major or minor seizures,
investigated in the studies reviewed
above, it is likely that the most varied
range of brain dysfunction and lesion
occurred. If personality differences
between epileptics are related to
neurophysiological factors, these
studies could not have revealed them.


The clinical classification of epilepsy is
hardly more adequate for research
purposes. This can be illustrated
with reference to the concepts of
temporal lobe epilepsy and petit mal
epilepsy. The former term is gen-
erally used to denote clinically de-
fined psychomotor seizures occurring
in patients with a temporal lobe EEG
focus. It has been shown, however,
that such a focus can be associated
with all kinds of clinical manifesta-
tions. Only 46% of one series of epi-
leptics with EEG ictal discharges in
the temporal region had clinical
psychomotor seizures (Jasper, Per-
tuiset, & Flanigin, 1951). Moreover,
[...] (1940) found that in 110 cases
with clinical psychomotor seizures,
only 32 had temporal lobe EEG foci.
Electroclinical correlations are even
lower in children (Glaser & Golub,
1955). Glaser and Dixon (1956) were
unable to find any relationship be-
tween response to a particular drug,
clinical form of seizure, or type of
EEG abnormality. Postmortem stud-
ies have revealed a great variation in
the extent, type, and location of
lesions in such patients (Gastaut,
1953).

Thus, while the term temporal lobe 
epilepsy implies the existence of a 
condition in which clinical symptoms, 
EEG findings, pathology and response 
to drugs are highly correlated, in fact 
the only common feature among pa- 
tients so diagnosed might be their 
common site of discharge in the re- 
gion of the temporal lobe. Symonds 
(1954) has suggested substituting the 
term “the temporal lobe epilepsies” 
to emphasize the variety of condi- 
tions found. For research purposes, 
it is desirable to select patients by well 
defined EEG and/or clinical criteria, 
rather than by diagnostic labels. 

The term petit mal epilepsy is also 
inadequate for research purposes. 
Originally a clinical term for any 
minor attack, it took on a more pre- 
cise meaning when Gibbs, Davis, and 
Lennox (1935) observed that momen- 
tary seizures, with or without falling, 
are often accompanied by a 3 per 
second SW EEG pattern. This elec- 
troclinical association is often called 
“true” petit mal, and appears to re- 
spond specifically to the dione drugs. 
Clinically identical momentary seiz-
ures may, however, be accompanied
by bursts of spikes or by complex 
wave forms, and at least one third of 
epileptics with 3 per second SW have 
no petit mal attacks (Clarke & 
Knott, 1955; Lundervold, Henriksen, 
& Fegersten, 1959; Silverman, 1954). 
Unless both EEG and clinical data 
are given, the type of minor attack 
cannot be identified.

A satisfactory classification of the 
epilepsies awaits greater understand- 
ing of their mode of action. Mean- 
while, in any attempt to establish the 
behavioral characteristics of particu-
lar groups of epileptics, it is important
to select the patients according to a
criterion whose reliability is known,
and which there is some evidence to
consider significant. Nuffield's study 
has shown the importance of EEG 
criteria. The reliability of such 
criteria has not yet been adequately 
studied, and the improvement of 
reliability probably awaits adequate 
quantification of certain EEG phe-
nomena.

An important variable which has 
been neglected is the type and amount 
of medication taken. This is not 
specified in any study. However, the 
drugs used to control seizures have 
important effects on the nervous 
system, some excitatory, some in- 
hibitory. One might reasonably 
postulate that personality differences 
between different groups of epileptics 
are primarily a function of the differ- 
ential long term effects of different 
types of drugs. Loveland, Smith, 
and Forster (1957) tackled this prob- 
lem, but their study covered a period 
of 3 months only, and the numbers in 
each group were very small. 

While it has often been suggested 
that personality disturbance in epi-
leptics occurs only in the presence of
cerebral lesions, this factor has
been inadequately studied. Halstead
(1957) found a positive but not sig-
nificant correlation between bad be-
havior in epileptic children and a
history of brain injury, and Vislie and
Henriksen (1958) found a tendency
for the severity of personality disturb-
ances in epileptic adults to be re-
lated to the extent of brain lesion, as
evaluated by neurological symptoms,
pneumoencephalography, and EEG
findings.

Without autopsy it is, however, 
often difficult to assess the presence, 
extent, and site of brain damage. A 
good history of birth injury is some- 
times available, but there is no simple 
relationship between birth trauma
and brain injury. In a series of 406
stillbirths and neonatal deaths, only
24% of those which had shown signs of cere-
bral irritation gave postmortem evi-
dence of intracranial trauma (Bound,
Butler, & Spector, 1956). Moreover,
numerous cases are described in
which serious lesions of the brain
have been found at postmortem
which were not suspected at birth.
Neurological signs may provide 
positive evidence of brain damage, 
but large sections of the brain may be 
damaged without giving rise to neuro- 
logical signs. Pneumoencephalo- 
graphic findings are rarely available 
and can only be regarded as valid 
criteria of brain damage when the 
lesion is gross. EEG abnormalities 
are not in themselves evidence of 
brain damage, nor do they necessarily 
represent accurately the locus and 
extent of brain damage. The EEG 
focus may be remote from the lesion, 
and the lesion may be more or less 
extensive than the focus. This is 
particularly true of temporal lobe 
foci, which because of the low convul- 
sive threshold of the temporo-orbital 
regions may be secondary to a pri-
mary lesion elsewhere. Hence, esti-
mates of the extent and locus of brain
damage can with the techniques at 
present available be at best approxi- 
mate. However, Meyer, Falconer, 
and Beck (1954) have found temporal 
lobe lesions in the brains of all epilep-
tics with temporal lobe EEG foci who
have come to autopsy, and according 
to Gastaut (1953) such lesions are 
more diffuse than those found in 
patients with any other focal epilepsy. 
The question arises, therefore,
whether the relationship between per- 
sonality disturbance and temporal 
lobe EEG foci is a function of the 
diffuse nature of the lesions rather 
than their location. An investigation
of the relationship between these 
factors would seem to involve cor- 
relating behavioral studies and EEG 
studies with autopsy findings. 


The contribution of social class 
differences to personality differences 
between epileptics has not been 
studied, and the social class distribu- 
tion of different forms of epilepsy is 
not known. It is often held, however, 
that temporal lobe epilepsy is more 
frequent in the lower social classes,
and there is some evidence that ag- 
gressive conduct disorders are more 
frequent in children of lower social 
class (O'Neal & Robins, 1958). It 
would therefore seem worthwhile to 
determine what contribution, if any, 
social class makes to the association 
between temporal lobe epilepsy and 
aggressive personality disorders. 
What has to be studied, in fact, is the 
behavior of individuals with mal- 
functioning nervous systems, dam- 
aged in different areas at different 
stages of development, controlled to a 
greater or lesser extent by different 
drugs, acting on and responding to 
different kinds of environment. It is 
clear that the methodological prob- 
lems involved are very complex and 
hardly touched by the usual studies 
that control age and IQ. While most 
of the variables cannot be experi-
mentally controlled, some of their
interrelationships are open to study. 

Another problem which has not
received sufficient attention is that
of sampling. Epileptics attending a
neurological or psychiatric institute,
[...] over a long period, are
likely to differ in important respects,
[...] psychological characteris-
tics, from epileptics attending a
general hospital. Pond's survey of
epilepsy in general practice has shown
that many epileptics are attended
only by general practitioners, and
that those seen as outpatients at
hospitals [...] pathologically
disturbed (Pond & Bidwell, 1960).

Another problem, not specific to this
field, lies in the selection of dimen-
sions of personality and the develop-
ment of reliable and valid instru- 
ments with which to measure them. 
For many years the only psycho- 
logical tool used was the Rorschach 
test, the inadequacies of which have 
been discussed above. So far the 
main alternative has been the clinical 
assessment of behavior. The reliabil- 
ity of this method is generally unde- 
termined, and indeed the develop- 
ment of criteria to assess such be- 
havioral manifestations as hyperkine- 
sis or distractibility is one of the more 
neglected problems in psychology. 

To the present writer, however, 
the most fruitful approach appears to 
be one which relates personality to 
more elementary forms of psycholog- 
ical functioning. In epilepsy it is 
known that we are studying the 
behavior of a nervous system which 
is certainly malfunctioning, often 
more or less diffusely damaged. The 
intelligence test scores of epileptics 
with known or suspected brain lesion 
tend to be below average (Collins, 
1951), but there has not yet been any 
study of the nature of the psycho- 
logical functions impaired in these 
patients. It would seem worthwhile 
to investigate the relationship be- 
tween the emotional and behavioral 
disturbances reported and more gen- 
eral defects, e.g., impairment in dis- 
criminative functions, or slowness to 
condition.


SUMMARY 


Five basic theories about the per- 
sonality of epileptics are outlined, 
and the extent to which they have 
been affirmed or refuted by clinical 
and psychological investigations is 
considered. The findings of studies 
which have used the Rorschach test 
are shown to be contradictory, and 
the inadequacies of this test for re- 
search purposes are pointed out. It 
is argued that progress in this field
depends on a recognition and study
of the complex environmental and 
pathophysiological factors involved, 
and on the development of reliable
criteria with which to classify epilep-
tics.


MEANINGFUL AND UNMEANINGFUL ROTATION
OF FACTORS 


JOHN W. THOMPSON 
University College of North Staffordshire 


Whereas hitherto the proponents of 
simple structure have admitted that 
a perfectly objective series of rota- 
tions which, according to them, would 
ensure a solution that correctly mir- 
rored reality, still remained beyond 
their grasp, it now appears to be only 
a matter of time before electronic
computing techniques make undis- 
putably unique solutions possible. In 
the writer's opinion the fundamental 
problem as to whether mathematic- 
ally exact solutions mirror reality 
will remain, and judgmental methods 
will not thereby be outmoded. In 
this article the historical background 
to problems of rotation is described 
and the problems discussed. The 
views of several well-known factor 
analysts—e.g., Cattell and Burt—on 
the subject of rotation are compared, 
and it is suggested that, as there will 
still be room for judgmental methods,
factor analysts, in addition to pro-
gramming exact simple structure,
should take up and develop sugges- 
tions for rotation made by H. J. 
Eysenck and W. Stephenson. It is 
also suggested that insufficient dis-
tinction is made between the polarity
of factors derived from different
types of data, and that whereas some-
times it is immaterial whether a factor
is bipolar or otherwise, on other oc-
casions it is important, so that in
[...] tests involving different kinds
of data, one ought to bear
in mind the polarity of resulting
factors. According to whether the
factors are unidirectional or bipolar,
factor analysts should also pay more
attention to the way in which big
and small loadings are distributed
within individual factors, instead of 
confining their interest to overall ar- 
rangement, as in simple structure, 
cluster analysis, and proportional 
profiles. In this way it would help to 
clear up difficulties connected with 
polarity which have apparently arisen 
in the interpretation of Eysenck's T, 
and would widen the scope of factor 
analysis as an instrument of mean- 
ingful classification. Far from decry- 
ing objective rotation, the writer 
suggests its extension to new applica- 
tions, provided that a parallel exten- 
sion is permitted in the development 
of judgmental methods. 


BACKGROUND TO THE PRESENT 
SITUATION 


It is customary, especially among 
American psychologists, to use trans- 
formations of the initial factor ma- 
trix rather than to interpret the load- 
ings as derived, but some psycholo- 
gists maintain that rotation is op- 
tional or even undesirable. The ques- 
tion, “Should rotation be entirely a 
mathematical procedure, or should it 
be based on ‘psychological mean-
ing’?” is historically an old one, and
whether or not to rotate is bound up
with (a) the status to be accorded to 
factors—whether they are principles 
of classification, functional unities, or 
unique entities, and (b) whether 
factor analysis is to be regarded as a 
branch of dependent or interdepend- 
ent statistics—whether its purpose is 
to prove a hypothesis or to explore. 
These considerations have been de- 
bated at length in a paper by Kendall 
and Babington Smith (1950), and in 
articles by Eysenck (1944, 1950, 
1952), and considerable space has
been devoted to rotational procedures 
in the main texts on factor analysis, 
e.g., those of Adcock (1954), Burt 
(1940), Cattell (1952), French (1953), 
Holzinger and Harman (1941), 
Stephenson (1953), Thomson (1939), 
and Thurstone (1947). In spite of 
all that has been said, rotation con- 
tinues to be a source of controversy 
and ambiguity, Cattell and French 
maintaining that inadequate rotation 
is responsible for more failures in 
factor analysis than anything else. 
The situation requires clarifying, as 
the otherwise powerful technique of 
factor analysis is weakened by wide- 
spread disagreement. Moreover it is 
essential to make such an effort now, 
for the gap between the psychologists 
who say that rotation is essential, 
and those who say it is optional, is 
likely to widen with the further de- 
velopment of electronic computing 
machines. We would now appear to 
be approaching the point at which it 
will be possible to arrive at a unique 
simple structure even for correlated 
factors, a procedure that has previ- 
ously been too difficult. The writer 
suggests that objective and judg- 
mental rotation, although rarely em- 
ployable by the same operator, using 
the same data, on the same occasion,
are nevertheless techniques which are 
complementary, if a sufficiently broad 
view is taken of the scope of factor
analysis. The writer would accord- 
ingly like to see a corresponding for- 
ward advance in the development of 
judgmental methods of rotation. 


Eysenck's Use of Rotation 


Eysenck has on the whole adopted 
a middle of the road position. He 
says (Eysenck, 1944) that whether or 
not factors are rotated, the various 
methods of factor analysis, far from 
being incompatible, are alternative
approximations of one another. This
point of view is, however, insufficient 
to account for Eysenck's discovery of 
the basic attitudes Radicalism-Con- 
servatism and Tendermindedness (R 
and T), if his claim to have achieved 
it essentially through a reinterpreta- 
tion of previous factorial studies by 
undoing the previous work of rota- 
tion (Eysenck, 1954) is to be accepted. 
Eysenck mentions, in particular, the 
analyses of Thurstone (1934), Lurie 
(1937), Duffy and Crissy (1940), and 
Hatt (1948). But how far, in fact, 
was the recalculation of centroid 
factors essential to Eysenck's dis- 
covery? Cattell maintains that it is 
merely a matter of chance whether 
unrotated factors are meaningful, and 
on that view it would be just pos- 
sible, due to chance, for the centroids 
of several successive analyses to be 
capable of a common interpretation,
but a more probable explanation is 
that even when the procedure of 
factor identification is believed to be 
completely objective, a certain 
amount of personal judgment is in-
volved, and that in originally dis-
covering Conservatism-Radicalism
and Tendermindedness, Eysenck un- 
consciously read meaning into the 
unrotated factors of the several anal- 
yses, in support of a brilliant but al- 
ready formed hypothesis. At any 
rate, in the course of Eysenck's fur- 
ther investigations, either the chance 
sequence of meaningful centroids
broke down or Eysenck realized that 
he could not continue to get support 
for his hypotheses from further un- 
rotated factors, because in the course 
of analyzing the data in his “social
insight” study (Eysenck, 1951) he
was forced to admit the necessity for 
rotation through 47 degrees to equate 
his centroid factors with the previ- 
ously identified R and T. Eysenck 
(1954) has elsewhere demonstrated 
that the angle of separation between
R and T and Ferguson's (1952) 
Religionism and Humanitarianism 
was 45 degrees. From Eysenck’s 
social insight study the equivalent of 
Ferguson's and not Eysenck's basic 
attitude factors must therefore have 
emerged directly, an event which was 
presumably unexpected, as up to 
that time Eysenck had cited the 
direct emergence without rotation of 
his own factors R and T as a reason 
for preferring them to Ferguson's 
Religionism and Humanitarianism.
This experience of Eysenck's would 
seem to dispose of any argument that 
centroid factors as such have a special 
meaningful property. 
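
The geometry behind this argument can be sketched numerically. The following Python fragment is an illustration only: the loadings are invented for the purpose and are not taken from Eysenck's or Ferguson's data, and the function names are hypothetical. It shows that turning a pair of orthogonal axes through 45 degrees changes the pattern of loadings completely while leaving each variable's communality untouched, so that the data alone cannot decide between the two positions.

```python
import math

def rotate(loadings, degrees):
    """Re-express two-factor loadings on axes turned through `degrees`."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    # Each row (a, b) is projected onto axes turned counterclockwise by t.
    return [(a * c + b * s, -a * s + b * c) for a, b in loadings]

def communalities(loadings):
    """Sum of squared loadings for each variable."""
    return [a * a + b * b for a, b in loadings]

# Hypothetical unrotated loadings: a general first factor, a bipolar second.
unrotated = [(0.60, 0.60), (0.50, 0.50), (0.55, -0.55), (0.45, -0.45)]

# After a 45-degree turn each variable loads on one factor only.
rotated = rotate(unrotated, 45)

for before, after in zip(unrotated, rotated):
    print(before, "->", tuple(round(x, 3) for x in after))
print([round(h, 3) for h in communalities(unrotated)])
print([round(h, 3) for h in communalities(rotated)])
```

Both sets of loadings reproduce the same communalities, which is one way of seeing why the choice between centroid and rotated positions has to rest on something other than goodness of fit.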

It was Eysenck (1950) who intro-
duced “criterion analysis,” using
rotation to align a factor with a
“criterion column” incorporated in
the correlation matrix. This pro-
cedure has had a mixed reception.
Cattell, for example, has shown a lack
of enthusiasm for it, but it is difficult
to see the reason for this, as Thur-
stone himself, although he did not in-
corporate a criterion column like
Eysenck, or provide a mathematical
method such as Lubin's (1950) for
bringing about the necessary trans-
formation, recommended the inclusion
in the factor analytic population of
[...] having widely varying char-
acteristics. Criterion analysis is quite
in keeping with the Thurstonian tra-
dition. Cattell's chief objection
appears to be that unless criterion
groups differ in respect of more fac-
tors than one, such groups constitute
[...], but he surely
misses the point, as the art in using
criterion analysis depends on skill-
fully selecting just such groups which
form special populations.
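
As a rough illustration of what aligning a factor with a criterion involves in the simplest two-factor case, the Python sketch below turns the axes until the criterion variable lies wholly on the first factor. It is a deliberately simplified assumption, not Eysenck's procedure or Lubin's (1950) mathematical method: the loadings, the criterion row, and the function names are all hypothetical.

```python
import math

def rotate(loadings, degrees):
    """Re-express two-factor loadings on axes turned through `degrees`."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(a * c + b * s, -a * s + b * c) for a, b in loadings]

def angle_to_criterion(criterion_row):
    """Angle (degrees) that brings factor 1 onto the criterion's direction."""
    a, b = criterion_row
    return math.degrees(math.atan2(b, a))

# Hypothetical loadings of three tests on two orthogonal factors.
tests = [(0.50, 0.30), (0.40, 0.45), (0.35, -0.20)]
# Hypothetical loadings of the criterion column on the same two factors.
criterion = (0.42, 0.42)

theta = angle_to_criterion(criterion)
aligned = rotate(tests + [criterion], theta)

print(round(theta, 1))
for row in aligned:
    print(tuple(round(x, 3) for x in row))
```

After the turn the criterion has a zero loading on the second factor, and the first factor can be read as the dimension that differentiates the criterion groups, which is the judgmental element the procedure introduces.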


Cattell's Views on Rotation


Cattell’s ideal is Thurstone’s sim-
ple structure, giving an invariant
solution, but he appears to vacillate 
in his views as to how far a unique 
solution is at present actually obtain- 
able, especially in the case where 
factors are correlated. Cattell's 
(1952) earlier position would appear 
to allow in practice for more than 
one solution with approximately sim- 
ple structure, depending upon the 
way in which the analyst proceeds in 
the initial stages of rotation, and the 
moves he makes when he begins by 
trial and error to seek out promising 
positions. More recently Cattell 
(1957) under the influence of Barg- 
mann (1954) has openly advocated 
blind rotation, maintaining that a 
factor analyst should be quite un- 
aware of the nature of the variables 
with which he is working, and saying 
that important sharpening of hyper- 
planes, involving essential changes of 
position over and beyond refinement, 
is delayed until the very closing 
stages. Cattell (1957) quotes Barg- 
mann as saying that several alterna- 
tive roughly satisfactory “simple 
structures” can be found which reach 
the 10% or even the 5% level of sig- 
nificance, but which are nevertheless 
(a) quite spurious in factor terms, 
and (b) emphatically below the sig- 
nificance (often at the .001 level) of 
the simple structure obtainable after 
thorough exploration of possible rota-
tion positions. It is evident from ar-
ticles by Cattell (1955), Kaiser (1958),
Neuhaus and Wrigley (1954), and 
Sokal (1958) that analytical solu- 
tions are now commonplace. On first 
thoughts it might appear probable 
that, as electronic computing tech- 
niques continue to improve, alterna- 
tive solutions will be discovered at 
even higher levels of significance, so 
that the mathematically perfect solu- 
tion might remain forever out of 
reach—indeed Saunders (1960) seems 
to envisage something of the kind.
This let-up is, however, only tempor- 
ary as similar arguments have ap- 
peared before. Stephenson, for ex- 
ample, feeling that it was difficult to 
accept one kind of geometrical sub- 
structure as, in principle, the only 
basis for inference, found consolation 
in the fact that, in practice, by way 
of single-plane and other solutions to 
the rotational problem, Thurstonian 
procedure allowed far more latitude 
than might appear at first sight 
(Stephenson, 1953, p. 41). Second 
thoughts on the subject indicate that 
if in allegedly unique solutions there 
is still room for permissiveness, it is 
only a matter of time, both in the 
orthogonal and oblique case, before 
undisputably unique solutions ap- 
pear. Even when this occurs, how- 
ever, it will not decide the issue of 
objective versus meaningful rotation, 
for the exact solution need not neces- 
sarily mirror reality, and for certain 
purposes, and particularly with cer- 
tain kinds of data, it would be advis- 
able to develop also solutions based 
on judgment. Even Cattell cannot 
entirely escape the element of judg- 
ing, for when arguing most strongly 
in favor of exact solutions, in the next 
breath he contradicts himself by ad- 
mitting the necessity for judgmental 
intervention. Thus he says (Cattell, 
1957), 


If, very occasionally, our art of judgment 
seems to defy a mechanical following of our 
own rules, it should be remembered that ex-
perience necessarily takes in additional con-
siderations beyond the index value, notably:
statistically sophisticated estimates of round- 
ing errors; effects of different communality 
estimates; differences of sample in mean and 
heterogeneity; degree of reliability of par- 
ticular tests in particular situations; changes 
of method in factor extraction, and the degree 
of “wobble” of the reference vector to be ex- 
pected in even the best visual simple structure 
determination. These are matters of good 
total perspective, and meticulous attention 
to sampling error and test reliability will avail
the experimenter nothing if he shows negli- 
gence in evaluating rotational procedures and 
checking that simple structure has truly been 
achieved (p. 232). 


Cattell also insists that a factor, as 
given by simple structure, must be in 
agreement with previous studies and 
with prevailing opinion. Thus it 
would seem to be hard to deny the 
necessity of the investigator's judg- 
ment, and the problem when to use 
it and how to incorporate it satis- 
factorily in rotation procedure. 


Sir Cyril Burt's Position 


Burt has tended to get less involved 
than other factor analysts in contro- 
versies on rotation, and it can be 
argued that in sectioning matrices by 
visual inspection of correlation clus- 
ters he in fact by-passes the problem. 
Burt (1940, p. 250) maintains that 
the primary value of factors is de- 
scriptive, that they are principles of 
classification, and that it depends 
upon the design of the experiment as 
to whether factors are meaningful or 
unmeaningful. The previously given 
quotation of Cattell's is in agreement 
with the last of these observations. 
Burt appears to be satisfied with the 
situation in which the Thurstone 
school work with correlated oblique 
first-order factors and arrive at sec- 
ond-order factors, which are orthogo- 
nal, while Burt works with orthogo- 
nal factors which include basic factors 
with which he says the second-order 
factors are approximately equiv- 
alent. However, it seems to the 
writer that Cattell’s second-order 
factors are something more than 
Burt's basic factors because of Cat- 
tell's emphasis on levels. Cattell's 
second order factors belong to a dif- 
ferent "computing realm" from lower-
order factors and there is, he says, a
risk of confusing the computing 
realms by altering the generality of
material from one analysis to another, 
which needs to be specially guarded 
against. If not, we are told, factors 
which are ostensibly the same, but 
actually different, may result. French 
(1953) describes several analyses with 
material of differing generality re- 
sulting in first-, second-, and allegedly 
third-order factors, viz., those of 
Baehr (1952), Lovell (1945), and 
Thurstone (1947, 1951). Should there 
be any limit to the number of dif- 
ferent factor orders, and if not, do 
the higher levels mean anything at 
all? If it is imperative to rotate first- 
order factors to simple structure, 
why is it not also necessary to rotate 
higher-order factors to give a unique 
solution at that level, and in order 
that they can be equated with sim- 
ilar factors and not confused with 
factors at other levels in different 
analyses? This reductio ad absurdum 
in connection with higher-order fac- 
tors suggests that the proponents of 
simple structure have no grounds for 
claims of logical self-sufficiency, and
for this reason alone investigators
should equip themselves with a
variety of rotational procedures, some
of which are judgmental.


Stephenson’s “Simplest Structure”


Stephenson (1953, p. 37) says
that what Thurstone describes as
simple structure is in fact the counter-
part, in variance design, of confounded
[...] complex designs for struc-
tured samples (in Thurstone's case
samples of persons), the explana-
tions [...] for Thurstone's various
[...] being imputations placed
upon the [...] who could be repre-
sented in balanced or other factorial
designs for samples of persons.
Stephenson says that the difficulty
[...] is that only one attri-
bute can be measured at a time, and
goes on to say (p. 40) that all scien-
tific behavior is relatively specific to
each experimental situation. As 
mentioned above, according to 
Stephenson, the virtue of Thurstone's 
centroid method, far from being 
uniqueness of simple structure, is its 
alleged permissiveness, which Stephen- 
son regards as admirably suitable for 
the doctrine in experimentation of 
“the concreteness of inferential be- 
havior.” Stephenson believed that 
Thurstone, in so far as he sought to 
limit this permissiveness, broke the 
axiom of concreteness of behavior. 
Stephenson (1953) describes his 
attitude to rotation as follows:


In our own case the rotations we pursue follow 
two broad principles. For unstructured 
samples we seek to determine sometimes 
what orthogonal structure best fits the data, 
for a balanced block design of effects, usually 
for two levels each [representing positive and 
negative loadings, respectively, on the fac- 
tor]. A balanced block design is called a case 
of “simplest structure” to distinguish it from 
Thurstone’s concept of “simple structure.” 
Ours are always orthogonal, but attention is 
also given to some properties of the structure 
which are widely overlooked in multiple fac- 
tor analysis. We not only seek to “explain” 
factors α, β, γ ... but we also ask that all
possible combinations of the factors such as
αβ, βγ, αγ should be explained. The inter-
pretative power of a factor rests in the combi-
nation it helps to explain, as distinct from its
analytic power which concerns the explanation
it provides in Thurstone's sense vis-à-vis a pri- 
mary factor (p. 41). (Copyright 1953 by the 
University of Chicago) 


In a further explanatory passage 
(ibid., p. 108) Stephenson says that 
the search for simple structure has 
led to interest being centered on tests 
which are “pure” with respect to one
factor, that is, which are loaded in 
that factor and no other, whereas if 
there are three factors any variable, 
as far as being significantly loaded is 
concerned, may be related to the 
three in any of eight ways, and these 
ways determine whether the variable 
is “pure” or “mixed” with reference
to the factors. Whereas, says Stephen- 
son, it has become almost axiomatic 
in factor analysis to regard the pure 
cases as of central importance, in fact
the mixed combinations as well as 
the pure should be taken into con- 
sideration, as also indeed should the 
null case—the variable not loaded on 
any of the three factors. Stephenson 
maintains that only full considera-
tion of all the possibilities, “pure,”
“mixed,” and “null” gives adequate
explanation, and the position which 
gives this full explanation is that of 
simplest structure. Stephenson ro- 
tates with a view to obtaining a small 
number of factors which, together 
with their combinations, account for 
the same data, instead of seeking a 
solution involving as many pure 
classes as possible. Probably because 
it admittedly involves judgment as
well as mathematics, Stephenson's 
approach has received an even less 
enthusiastic reception than Eysenck's 
criterion analysis from the Thurstone 
school. 
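
The eight ways referred to above follow directly from the fact that each of three factors either does or does not carry a significant loading for a given variable. A short Python enumeration makes the count explicit; the factor names and the three class labels are simply those used in the passage, and the coding of a significant loading as 1 is an assumption made for illustration.

```python
from itertools import product

FACTORS = ("alpha", "beta", "gamma")

def classify(pattern):
    """Label a 0/1 loading pattern as null, pure, or mixed."""
    loaded = sum(pattern)
    if loaded == 0:
        return "null"
    if loaded == 1:
        return "pure"
    return "mixed"

# Every combination of significant (1) and non-significant (0) loadings.
patterns = list(product((0, 1), repeat=len(FACTORS)))

counts = {}
for pattern in patterns:
    label = classify(pattern)
    counts[label] = counts.get(label, 0) + 1
    loaded = [f for f, flag in zip(FACTORS, pattern) if flag] or ["none"]
    print(label, "+".join(loaded))

print(len(patterns), counts)
```

Of the eight patterns only three are pure, so a rotation that attends to pure cases alone leaves four mixed patterns and the null case out of the explanation.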

At this point the following argu- 
ment for developing judgmental as 
well as objective methods of rotation 
is relevant. Although it is usually 
considered an advantage that ana- 
lytical methods of rotation such as 
quartimax (Neuhaus & Wrigley, 
1954) diminish the chances of any 
particular variable being overrepre- 
sented or underrepresented in the 
factor constellation as a whole, for 
specific purposes such one-sidedness 
may actually be demanded. Hence 
the need for judgmental as well as 
analytical procedures. To take a 
simple illustration, when a person 
with a British railway ticket travels 
from London to Weymouth, the 
route printed on the ticket is “via 
Theale,” and similarly when travel- 
ing from London to Bournemouth it 
is "via Sway." The more important
the route, the less important is the 
station defining the route, Theale and 
Sway, in particular, being small 
places meaning nothing to the ordi- 
nary traveler. Such stations are, how- 
ever, important to the railway com- 
pany for discriminating between dif- 
ferent journeys and form part of a 
system of meaningful coordinates. In 
a similar way, it may require a dif- 
ferent kind of factor constellation to 
satisfy an expert with a specific pur- 
pose, than to fulfil the requirements 
of an analyst whose purposes are less
exact and more general. 


EXTENSION OF CRITERION ANALYSIS 
AND SIMPLEST STRUCTURE 


An essential feature of criterion 
analysis is incorporation within the 
correlation matrix of a column of cor- 
relation coefficients between each of 
the "tests" and the criterion, the
latter being usually a characteristic 
or behavior index differentiating cri- 
terion groups. The criterion can, if 
desired, be established by means of 
preliminary experiments involving 
Eysenck's (1950) principle of double 
maximization which entails alternate 
purification of criterion and test. The 
following extension of the use of the 
criterion is suggested. In his study 
of basic attitudes Eysenck incorpo- 
rated the criterion in the factor matrix 
only in the main investigation, but 
omitted it in subsequent studies on 
smaller populations in specific areas 
and under varying conditions, taking 
it for granted that the R question- 
naire continued to measure Radical- 
ism-Conservatism. It would be better 
in some investigations to choose a
criterion which could be included in 
subsequent studies as well as in the 
initial investigations. Such a crite-
rion would have to be long enduring,
concrete, easily identifiable, and in-
dependent of location, which may,
at first sight, appear to be asking for 
too much, but such criteria could be 
found, and the search would be re- 
warding. It is suggested that 
Stephenson's notion of simplest struc- 
ture, in accordance with which the 
factor analyst looks for a few
factors that, together with their 
combinations, account for the data, 
could be helpful in choosing a crite- 
rion, and that such a criterion would 
be superior to the markers carried for 
the purpose of obtaining near-unity 
loadings that are customarily used 
to identify factors in successive ex- 
periments, as such markers are like 
dummy tracer mixed with live am- 
munition, there being no guarantee 
that superficial characteristics which 
are easy to follow are consistently 
correlated in successive experiments 
with the remaining experimental ma- 
terial. A criterion selected in the 
manner suggested could keep a whole 
series of investigations in alignment 
without ambiguity of interpretation. 


POLARITY IN FACTOR ROTATION


It is also proposed that in rotation
more importance should be at-
tached to the polarity of factors; this
can be conveniently illustrated
by further reference to crite-
rion analysis. In the case of a bipolar
factor the criterion would, of course,
be bipolar. In criterion analysis
questions arise regarding the
nature of a continuum, for Eysenck
has maintained that criterion analysis
is more than the mere rotation
of factors, as had been previously
supposed, and has claimed that in
the case of neuroticism, by incorpo-
rating in his matrix of correla-
tions what he called "the criterion
column," it was actually demon-
strated that the putative factor of
neuroticism formed a quantitative
continuum (Eysenck, 1950). Using
biserial or tetrachoric correlations, 
he derived the criterion column by 
correlating a number of tests that 
discriminated between normals and 
neurotics, with what he called “the 
normal-neurotic dichotomy.” These 
tests had been administered to pa- 
tients at Dartford by Himmelweit, 
Desai, and Petrie (1946) and Eysenck 
subsequently factor analysed the 
data. His argument regarding the 
neurotic continuum was as follows— 
if neuroticism was a continuous vari- 
able, the group of normal patients 
taken alone should include persons 
differing in neuroticism. Persons in 
this normal group who were more 
neurotic, should obtain higher scores 
on those tests which discriminated 
more markedly between the criterion 
groups, and moreover such tests 
should, on the average, be positively 
correlated. Also, according to 
Eysenck, if neuroticism was a con- 
tinuous variable, and the tests were 
to be factor analyzed, the test which 
best discriminated normals from neu- 
rotics should have the highest load- 
ings on the factor of neuroticism and 
vice versa, and the other tests should 
have factor loadings which were pro- 
portional to the criterion values. To 
test these assumptions, Eysenck then 
performed a factor analysis, and ro- 
tated the first summation or centroid 
factor into maximum correlation 
with the criterion column, maintain-
ing that if the hypothesis that neurot-
icism formed a quantitative con-
tinuum was not substantiated by the 
data, no amount of rotation would 
succeed in giving any but chance 
correlation between the criterion 
column and the corresponding factor, 
and that a high correlation between 
the rotated factor and the criterion 
would constitute visible support for
the assumption of a neurotic con-
tinuum. Eysenck did in fact obtain
a high correlation, and maintained 
that this was a convincing demon- 
stration that neuroticism formed a 
quantitative continuum, at the one 
extreme of which were to be found 
hospitalized neurotics, while what he 
described as "so-called normals"
were to be found all the way from
"near-neurotic and neurotic" to "the
conspicuously nonneurotic, mature, 
stable, and integrated type of person- 
ality." 
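
The rotation step in this argument can be sketched for the two-factor case. The sketch assumes that "maximum correlation with the criterion column" is read as maximum proportionality (cosine) between the rotated loadings and the criterion values, and it finds the direction by least squares. The function name and the loadings are hypothetical, and this is not offered as Eysenck's own computing routine.

```python
import math


def rotate_to_criterion(loadings, criterion):
    """Rotate within a two-factor plane toward the criterion column.

    loadings:  list of (a1, a2) unrotated loadings, one pair per test.
    criterion: list of test-criterion correlations, one per test.
    Returns the unit direction (t1, t2), the rotated loadings, and the
    cosine between the rotated loadings and the criterion column.
    """
    # Normal equations (A'A) w = A'c for the two columns of loadings.
    s11 = sum(a1 * a1 for a1, _ in loadings)
    s12 = sum(a1 * a2 for a1, a2 in loadings)
    s22 = sum(a2 * a2 for _, a2 in loadings)
    b1 = sum(a1 * c for (a1, _), c in zip(loadings, criterion))
    b2 = sum(a2 * c for (_, a2), c in zip(loadings, criterion))
    det = s11 * s22 - s12 * s12
    w1 = (b1 * s22 - b2 * s12) / det
    w2 = (b2 * s11 - b1 * s12) / det

    # Only the direction matters for a rotation, so scale to unit length.
    norm = math.hypot(w1, w2)
    t1, t2 = w1 / norm, w2 / norm

    rotated = [a1 * t1 + a2 * t2 for a1, a2 in loadings]
    dot = sum(r * c for r, c in zip(rotated, criterion))
    len_r = math.sqrt(sum(r * r for r in rotated))
    len_c = math.sqrt(sum(c * c for c in criterion))
    return (t1, t2), rotated, dot / (len_r * len_c)


# Hypothetical loadings; the criterion is built to be exactly proportional
# to the loadings on the direction (0.8, 0.6), so the sketch should recover it.
example_loadings = [(0.7, 0.3), (0.6, 0.2), (0.5, -0.4), (0.4, -0.5)]
example_criterion = [0.5 * (0.8 * a1 + 0.6 * a2) for a1, a2 in example_loadings]
direction, rotated, cosine = rotate_to_criterion(example_loadings, example_criterion)
print(direction, cosine)
```

When the criterion values are not proportional to any direction in the factor space, the cosine falls below unity, which corresponds to the case in which rotation fails to give more than a chance relation with the criterion column.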

It would appear from the above 
account that Eysenck regards neurot- 
icism as unipolar, i.e., measured 
along what the writer has elsewhere 
called a unidirectional continuum in- 
volving measurement from 0 to unity 
(or, if it is preferred, infinity) and 
not along a bipolar continuum from 
—1 to +1 (Thompson, 1961). But 
although he envisages neuroticism as 
unipolar, Eysenck evidently regards 
extraversion as bipolar, for his usual 
practice is to speak of "extraversion-
introversion" (e.g., Eysenck, 1956a).
And, although Eysenck always talks 
of Conservatism-Radicalism he some- 
times uses the term Tenderminded- 
ness instead of Tendermindedness- 
Toughmindedness—is this a conve- 
nient abbreviation, or does this indi- 
cate disregard of important questions 
appertaining to polarity? The use of 
the notation R and T certainly by- 
passes any such problems if they 
exist. This prompts the writer to ask 
the following question. "Had Eysenck
so wished, would it have been legiti- 
mate, in dealing with the normal pop- 
ulation only, to have altered the sys- 
tem of scoring of his tests and the 
signs of his correlations, and thus to 
have had a bipolar factor of neuroti-
cism?" It would, in the writer's
opinion, have been permissible, had
Eysenck wanted to do this, in this par-
ticular case. Using the normal group
only, it was immaterial whether all
the loadings of the resulting factor of 
neuroticism were positive, or whether 
a symmetrically bipolar factor re- 
sulted from rotation—this kind of 
polarity should, the writer has sug- 
gested (Thompson, 1961), be called 
tautologous or “convertible” bipolar- 
ity. 
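
The sense in which such bipolarity is "convertible" can be illustrated with a small sketch, assuming a single factor and hypothetical loadings. Reversing the scoring of some tests changes only the signs of their loadings and of the corresponding correlations; the magnitudes, and hence the information carried by the factor, are untouched.

```python
def reflect(loadings, reversed_tests):
    """Flip the sign of the loading of every test whose scoring is reversed."""
    return [-a if i in reversed_tests else a for i, a in enumerate(loadings)]


def reproduced_correlations(loadings):
    """Correlations reproduced by a single factor: r_ij = a_i * a_j."""
    n = len(loadings)
    return [[loadings[i] * loadings[j] for j in range(n)] for i in range(n)]


# Hypothetical symmetrically bipolar factor on four tests.
bipolar = [0.8, 0.6, -0.6, -0.8]
reversed_tests = {2, 3}  # tests rescored in the opposite direction

unipolar = reflect(bipolar, reversed_tests)
r_bipolar = reproduced_correlations(bipolar)
r_unipolar = reproduced_correlations(unipolar)

print(unipolar)
print(r_bipolar[0][2], r_unipolar[0][2])
```

The reflected factor has all loadings positive, and each reproduced correlation keeps its size, changing sign only where exactly one of the two tests has been rescored. No such relabelling is available where the two poles differ in kind, which is the case the writer calls inconvertible.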

Eysenck’s factor analysis was, how- 
ever, performed on the normal group 
only. Supposing Eysenck had in- 
cluded the hospitalized group of 
neurotics as well as the normal group 
of patients in his factor analysis, 
what would the position have been 
then? In the writer’s opinion, the 
inclusion of the hospitalized group 
would have introduced complica- 
tions, and if there were qualitative 
differences between the normal and 
hospitalized groups, a unidirectional 
factor would no longer suffice, and it 
would have been absolutely necessary 
for neuroticism to be represented as 
a bipolar factor. This is what the 
writer means by “inconvertible bi- 
polarity." From the point of view of 
explaining the difference between 
convertible and inconvertible bipolar-
ity, it is highly convenient that in the
Dartford study the criterion groups 
were different in certain important re- 
spects from those used in similar 
studies reported by Eysenck. Thus 
the Dartford neurotics in the investi- 
gation undertaken by Himmelweit, 
Desai, and Petrie (1946), and whose 
data Eysenck subsequently used to 
demonstrate the existence of the 
neurotic continuum, were less severely 
neurotic than the cases used previ- 
ously, while it was found that the 
normal group were less stable than 
expected. According to Himmelweit, 
Desai, and Petrie (1946), 

Dr. Maxwell Jones, who was in charge of the
Dartford patients and had previously worked
at Mill Hill, reported that the patients were
less seriously ill. He found that the break-
down of the Dartford patients was to a larger
extent determined by exceptional environ- 
mental stress and to a lesser extent by neu- 
rotic predisposition. On the other hand the 
normal group was probably less well adjusted 
than the groups of actual soldiers who had 
previously served as controls. Hospitalisa- 
tion tends to bring out neurotic attitudes; it 
is also likely that some of the surgical com- 
plaints were the result of accident proneness 
which has been shown to be correlated with 
neuroticism (p. 176). 


In the present writer's opinion, if a 
factor analysis of the Dartford neu- 
rotics and normals taken together 
had been performed, there would still 
not have been sufficient qualitative 
differences between the two groups to 
make it a matter of importance 
whether the factor of neuroticism 
was bipolar or whether it was uni- 
directional (convertible bipolarity). 
But if, on the other hand, wounded 
and otherwise healthy soldiers were 
pooled with severe neurotics like those 
at Mill Hill in a factor analysis, a 
bipolar factor would be more than a 
matter of form; it would be obliga- 
tory. This is what is meant by incon- 
vertible bipolarity. The example of 
the Mill Hill and Dartford groups is 
not an isolated one; whenever factors
are derived from the ordering of pref-
erences a parallel to the qualitative
difference between the healthy soldiers
and severe neurotics is to be found;
this can be demonstrated in the meas-
urement of values. Brogden (1952)
has thus remarked on the qualitative
differences between the extremes of
bipolar factors measuring values (e.g.,
"[Id]eality-Practicality"), observing
that those who score highly on
such factors have characteristics for
which there are, properly speaking,
no counterparts in those scoring low,
and vice versa. There is in an esthetic
versus politico-economic factor,
such as might be derived from the
Allport-Vernon Study of Values,
nothing that concerns politics or
economics on the esthetic side of 
the factor; such a factor marks a 
separation between opposites that are 
not only on a continuum but are also 
qualitatively different, and the rela- 
tionship involved is more complex 
than in a unidirectional or only 
nominally bipolar factor which repre- 
sents, let us say, “frequency of eye- 
blink.” 

The distinction between the two 
kinds of polarity may, it is now sug- 
gested, be of particular importance in 
politics. As far as their party pro- 
grams are concerned and probably 
more generally than is realized, poli- 
ticians of different complexions do not 
just have different attitudes to the 
same things; they are largely con- 
cerned with incompatible aims and 
objects, and are not even interested 
in many of the things which are of 
considerable concern to their oppo- 
nents. A comparison of the programs 
of any two rival political parties 
should be sufficient to demonstrate 
this. Unless exceptional circum- 
stances prevail, purely political ide-
ologies are "exclusive," not "collu-
sive," and to represent political con-
servatism-radicalism as unidirec- 
tional would be to assume a coalition 
where none exists. It is interesting 
to note in this connection that 
Cohen (1951) feels strongly that no 
one, especially a person engaged in 
politics, should be allowed to state 
his own case or to criticize his op- 
ponent until he has satisfied him and 
any others present that he has 
grasped his opponent's point of view 
(Cohen, 1958, p. 152). Although at 
first sight this appears an admirable 
suggestion, there is another side to 
the picture, and Cohen's suggestion, 
if carried to excess, might destroy the 
inconvertible polarity which the pres- 
ent writer considers is an essential 
feature of all value judgments, and
make it impossible to pass any criti- 
cism whatsoever. In any event, as 
long as politicians eschew role re- 
versal, and behave as they do at 
present, the fact that political ideol- 
ogies are exclusive and not collusive 
must be taken into consideration 
when factor analyzing political atti- 
tudes. 

Having called attention to the ex- 
istence of two kinds of polarity, 
what practical steps are factor ana- 
lysts to take to differentiate between 
them? Menger (1959) has pointed 
out, in connection with measurement 
theory, that although mathematical 
formulae may appear formidable to 
the uninitiated, in actuality mathema-
ticians are hampered by serious de- 
ficiencies in respect of notation, and 
cannot easily express certain essential 
mathematical notions. There would 
thus appear to be little hope of look- 
ing to mathematicians for an immedi- 
ate formulation of the difference be- 
tween convertible and inconvertible 
polarity—a subject with which logi- 
cians are more familiar than mathe- 
maticians—as the latter can hardly 
cope with their existing difficulties. 
A mathematical formulation should, 
however, remain the long-term objec- 
tive. In the meantime, it is suggested 
that one method of distinguishing between
the two kinds of polarity in factor 
analysis might be by way of judg- 
mental rotation, i.e., by treating in- 
convertibly bipolar factors such as 
those for values and political atti- 
tudes in a different way from other 
factors. Such difference in treatment 
would concern the manner in which 
individual factors were spread over 
"tests." It is suggested that in rotat- 
ing inconvertibly bipolar factors the
object should be to concentrate the
loadings at the extremities and to 
leave a transition zone in the middle, 
whereas in other factors (i.e., those in
which all the loadings are positive, 
or in which bipolarity is merely 
nominal) the aim should be even 
spacing all along the continuum (to 
have as full a range of loadings as 
possible). But quite apart from this 
question of polarity the writer thinks 
that more attention should be paid 
to loadings within individual factors, 
that too much regard is paid to the 
proportion contributed by factors to 
the total variance, and that an ex- 
cessive emphasis is placed upon the 
general pattern or configuration of 
factors taken together. However, the 
proposed suggestions are not intended 
to have absolute priority, and the 
size of factor loadings and their spac- 
ing are, it is realized, bound to be 
determined in part by the influence 
of the remaining factors. Nor is it 
intended to convey that it is a matter 
of great importance, if loadings are 
irregularly dispersed or vary greatly 
in the case of factors embracing only 
a few tests. What is suggested, is 
that when inconvertibly bipolar fac- 
tors embrace a large number of tests, 
the concentration of loadings at the 
extremities of the factor (for which 
the term “antithetical concentration" 
is now suggested) may prove decisive 
in choosing between alternative, and 
otherwise equally acceptable, rota-
tions, and that what might
be called "even spacing," i.e.,
the uniform dispersal of loadings over
the continuum, may prove similarly
decisive in the case of factors which 
are not inconvertibly bipolar. 

In respect of antithetical concen- 
tration and even spacing, a “ready- 
reckoner" in the form of a table or 
nomogram might be of use to factor 
analysts to indicate whether one rota- 
tion was preferable to another. An 
investigator might want to know, in 
the case of an inconvertibly bipolar 
factor, whether, e.g., loadings of -.9
-.9 -.8 -.7 and +.5 +.7 +.8 +.8
were to be preferred in respect of anti-
thetical concentration to loadings of
-.9 -.7 -.4 -.3 and +.9 +.9 +.9
+.9, or whether in the case of a uni-
directional factor loadings of .9 .8 .5
.5 .2 were preferable in respect of even
spacing to .8 .8 .7 .4 .2. (It would
sometimes not be easy to come to a 
decision on the basis of inspection.) 
The construction of a ready-reckoner 
would not necessarily involve difficult 
computational procedures; the simple 
averaging of appropriate sums or dif- 
ferences might suffice, the situation 
resembling in this respect, that in 
which L. W. Ferguson (1944), in 
constructing his basic attitude scales 
for Religionism and Humanitarian- 
ism discovered that the rounded-off 
correlation coefficients between items 
and the attitude continuum yielded 
item weights just as satisfactory as 
those derived from more complicated 
procedures. But, bearing in mind 
that only in a superficial sense is a 
factor loading of .6 twice that of a 
loading of .3, it might be necessary 
for the accurate calculation of indices 
of antithetical concentration and 
even spacing to construct a nomo- 
gram on more elaborate principles, 
based on the rate at which the coeffi- 
cients drop off from extreme to lower 
values. 
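
By way of illustration only, the "simple averaging of appropriate sums or differences" might be sketched as follows. Both indices are assumptions introduced here, since the text does not define them: antithetical concentration is taken as the average of the mean absolute loading at each pole, and unevenness of spacing as the mean absolute deviation of the gaps between adjacent ordered loadings (smaller values meaning more even spacing). The function names are likewise hypothetical.

```python
def antithetical_concentration(loadings):
    """Average of the mean absolute loadings at the negative and positive poles.

    Assumed index: values near 1 mean both poles carry uniformly high loadings.
    """
    negative = [-a for a in loadings if a < 0]
    positive = [a for a in loadings if a > 0]
    if not negative or not positive:
        raise ValueError("a bipolar factor needs loadings at both poles")
    return (sum(negative) / len(negative) + sum(positive) / len(positive)) / 2


def spacing_unevenness(loadings):
    """Mean absolute deviation of the gaps between adjacent ordered loadings.

    Assumed index: 0 means perfectly even spacing along the continuum.
    """
    ordered = sorted(loadings)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    mean_gap = sum(gaps) / len(gaps)
    return sum(abs(g - mean_gap) for g in gaps) / len(gaps)


# The sets of loadings quoted in the text.
bipolar_a = [-0.9, -0.9, -0.8, -0.7, 0.5, 0.7, 0.8, 0.8]
bipolar_b = [-0.9, -0.7, -0.4, -0.3, 0.9, 0.9, 0.9, 0.9]
unidirectional_a = [0.9, 0.8, 0.5, 0.5, 0.2]
unidirectional_b = [0.8, 0.8, 0.7, 0.4, 0.2]

print(antithetical_concentration(bipolar_a), antithetical_concentration(bipolar_b))
print(spacing_unevenness(unidirectional_a), spacing_unevenness(unidirectional_b))
```

Under these assumed definitions the first bipolar set scores about .76 against about .74 for the second, and the second unidirectional set is the more evenly spaced (about .10 against about .13). A different choice of index, for instance one weighting the extreme loadings more heavily, could reverse either ordering, which is the reason for the more elaborate nomogram contemplated above.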

One reason why it might be worth-
while to construct a nomogram for in-
vestigators is that in the past insuffi-
cient appreciation of problems of
polarity has apparently led to con-
fusion, and in particular, may lie at
the root of the misunderstanding be-
tween Eysenck and other investi-
gators concerning Eysenck's T, re-
ported in the Psychological Bulletin
(Christie, 1956a, 1956b; Eysenck,
[...], 1956c; Rokeach & Hanley,
1956a, 1956b). In the course of this
controversy, criticisms were made
regarding Eysenck's scoring, and 
also regarding his contention that 
his failure, after exhaustive attempts, 
to find more than a few items with 
high loadings on T and no other 
factor, could be construed as support 
for the view that T was difficult to 
define and was probably closely re- 
lated to extraversion-introversion act- 
ing in conjunction with R. 

An area in which the use of an 
index of antithetical concentration 
might have special applications is 
that of Q technique, where factors 
are defined by clusters of subjects 
who have, e.g., a similar personality 
make up. Here it should be possible 
to compare factor rotations in which 
the antithetical concentration was 
maximised with respect to different 
persons or groups. Given the right 
kind of data, it would in this way be 
possible to provide a flexible measure 
of group interaction, and to narrow 
the gap between factor analysis and 
field theory. A further possibility 
might be the production of asym- 
metrically bipolar factors with more 
loadings on one side of a neutral point 
than on the other corresponding to 
the bias associated with taking the 
data relating to a given group as 
factorially neutral; the same effect 
could be studied in R technique by 
taking a particular test or group of 
tests as a neutral centre of factorial 
reference. From developments such 
as these a technique of psychometric 
relativity could emerge, which, al- 
though having a resemblance to field 
theory, would remain a factorial
technique in affinity with Stephen-
son's doctrine of the "concreteness of
inferential behavior."


CONCLUSIONS


There is still room both for judg-
mental and objective methods of
rotation and the two techniques, al-
though separate, should be comple-
mentary. In objective methods of
rotation, e.g., simple structure, clus- 
ter analysis, and proportional pro- 
files, consideration is given to the 
arrangement of all the factors to- 
gether, but insufficient regard is paid 
to the distribution and size of load- 
ings within individual factors. The 
terms ‘‘antithetical concentration” 
and “even spacing" are suggested 
for use with inconvertibly bipolar 
and other factors, respectively. It is 
hoped in this connection that these 
concepts will lead to greater dis- 
crimination in the analysis of differ- 
ent types of data and that they will 
prove useful principles for factor ro- 
tation. It is also hoped that the dis- 
tinction between the two kinds of 
polarity will help to clear up past mis- 
understandings regarding Eysenck's 
criterion analysis, a form of factor 
analysis which could with advantage
be adapted and used more exten-
sively. This is also true of rotation in 
accordance with principles suggested 
by Stephenson. The writer admits 
the uses of simple structure, but is 
also in sympathy with Stephenson's 
tendency to emphasize the particular, 
and takes the view that the technique 
of rotation to be employed should de- 
pend upon the nature of the data, as 
well as on the aims of the investi- 
gator. There is thus room both for 
mathematically exact solutions and 
for judgmental rotation. With the 
development of electronic computers 
the present time is especially favor- 
able for developing exact mathe- 
matical solutions, and these will be-
come increasingly available. The
popularity of objective methods of
factor rotation is thus assured, but
care is needed to ensure that judg- 
mental methods are not forgotten 
and that their potential use in new 
applications is also recognized. 


Stylistic consistencies in subjects'
responses to personality tests (Jack- 
son & Messick, 1958) are mediated 
by the form in which the stimulus 
materials, or items, are presented to 
the subject. The limitations on re- 
sponse alternatives imposed by the 
test format do not create stylistic 
consistencies in subjects—rather they 
define the sample of stylistic re- 
sponses to be measured on any oc- 
casion. Nevertheless, there are some 
response measures that are so closely 
related to the structure of the instru- 
ment that they appear to tell us 
more about the measurement method 
than they do about the subjects. 
Consequently, investigations of re- 
sponse styles may lead to the detec- 
tion of stylistic consistencies which 
are personality traits in their own 
right or they may uncover sources of 
method variance that increase our 
understanding of the instruments 
themselves. An additional considera- 
tion with psychiatric inventories, 
such as the MMPI, is the assessment 
strategy of criterion group compar- 
ison. This particular scale construc- 
tion design employs a statistical 
definition of personality dimensions 
which has profound implications for 
the kinds of styles that may be 
sampled with the instrument. 

The present paper draws a distinc- 
tion among three sources of variance 
that contribute to assessment with 
the MMPI. These will be termed: 
strategic, method, and stylistic var- 
iance, respectively. By strategic var- 
iance is meant variation in test 
scores that may be attributed to the 
overall strategy of constructing scales
to discriminate between criterion
groups and a normative population. 
The major dimension of variance 
here is variance in relative com- 
munality between any given subject 
and a predetermined normative group. 
For selection purposes, it is conven- 
ient to find dimensions of noncom-
munality between criterion group
members and the normative group 
such that any individual may be 
classified as being more or less like 
the criterion group or the normative 
group on this particular predictive 
dimension. 

The particular source of method
variance at issue with inventories 
such as the MMPI is the employ-
ment of "true-false" or "agree-dis-
agree" item response options which
act as constraints on assessment 
through communality concepts. That 
is, although the statistics of criterion 
group discrimination demand a stra- 
tegy for estimating individuals' com- 
munality on a given dimension, the 
method whereby this is accomplished 
with the MMPI is one in which an 
individual may indicate communality 
or noncommunality by answering 
"true" or "false" as the case is 
determined by item phrasing. Al- 
though this limitation on item re- 
sponse may permit assessment of 
general stylistic tendencies to "agree" 
or "disagree," which may themselves 
be of predictive significance, we are 
referring here to the idiosyncratic 
nature of the particular item pool em-
ployed with a given instrument. This
would include such characteristics of
the item pool as the proportion of
true and false keyings that exists
among the items and the distribution
of relative communality or "item
popularity" values within the par-
ticular item pool.

* Portions of this paper were read at the
Conference on Personality Measurement,
October 13, 1960, Educational Testing Serv-
ice, Princeton, New Jersey.

By stylistic variance is meant cer- 
tain characteristic response consist- 
encies on the part of subjects which 
may be shown to exist relatively in- 
dependently of the test itself, but 
whose detection is to a large extent 
limited by the appropriateness of a 
given test as a stimulus for these con- 
sistencies to emerge. The specific 
stylistic dimension considered here 
is that of "acquiescence versus cau-
tiousness." This unidimensional re-
sponse style is here given a more nar-
row definition than is usually the
case and an attempt is made to clearly
differentiate it from both strategic
and method sources of variation. It
will be argued that although this
stylistic tendency may be considered
as important on other grounds, its
operation within the context of the
MMPI has generally been obscured
by the operations of both strategic
and method variance in determining
test scores.


ITEM COMMUNALITY AND THE
STRATEGY OF CRITERION
GROUP ASSESSMENT


Items that display strong modal
tendencies for one of the response
options in a normative group have a
high degree of communality. That is,
the normative group is relatively uni-
form in its endorsement or rejection
of the item. Because of the wide-
spread preference among constructors
of inventories for the sim-
plest form of item weighting (1 or
0), this implicit "intensity" para-
meter is seldom systematically taken
into account (Guilford, 1954). Never-
theless, it is recognized that very
high degrees of communality impose 
limits on the potential usefulness of 
an item in the strategy of criterion 
group comparison (Berg, 1961; Sech- 
rest & Jackson, 1960). Similarly, 
those who favor the utilization of so-
called "subtle-0" items (Meehl, 1945)
would recognize that high communal- 
ity would impose an upper limit since 
this type of item requires that the 
criterion group communality exceed 
that of the normative group. More 
important, perhaps, is the possible 
psychological (trait) significance of 
an "unpopular" or noncommunality 
response to an item of high com- 
munality as opposed to an unpopular 
response to an item of medium com- 
munality. Within the MMPI this is 
best illustrated by the F scale which 
is, in essence, a noncommunality 
scale consisting of items of very high 
communality value keyed in the 
direction of unpopularity. Answering 
many more than 4 or 5 of these 64 
items in the keyed direction might 
seriously jeopardize a respondent's 
outpatient status (Dahlstrom & 
Welsh, 1960). 
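
The logic of a noncommunality scale of this kind can be sketched as follows. The sketch assumes true-false responses coded as booleans and a cut-off of .90 for "very high" communality; the cut-off, the function names, and the data are hypothetical and are not taken from the MMPI itself.

```python
def modal_responses(normative_responses):
    """For each item, the modal answer and its communality in the normative group.

    normative_responses: one list of booleans per subject (True = "true").
    Returns a list of (modal_answer, communality) pairs, one per item.
    """
    n_subjects = len(normative_responses)
    n_items = len(normative_responses[0])
    summary = []
    for j in range(n_items):
        p_true = sum(1 for subject in normative_responses if subject[j]) / n_subjects
        summary.append((p_true >= 0.5, max(p_true, 1 - p_true)))
    return summary


def noncommunality_score(answers, summary, min_communality=0.90):
    """Count answers given against the modal response on high-communality items."""
    return sum(
        1
        for answer, (modal, communality) in zip(answers, summary)
        if communality >= min_communality and answer != modal
    )


# Hypothetical normative group of 20 subjects and three items:
# item 0 is answered "true" by all, item 1 "false" by 19, item 2 splits evenly.
normative = [[True, i == 0, i < 10] for i in range(20)]
summary = modal_responses(normative)

print(noncommunality_score([False, True, True], summary))
print(noncommunality_score([True, False, False], summary))
```

Only the first two items enter the score, since the evenly split item has no modal response to deviate from. A respondent answering both against the mode scores 2, and one answering with the mode scores 0.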

It has become customary to refer 
to items of intermediate or medium 
communality values as items of high 
"controversiality" (Fricke, 1957; 
Hanley, 1957). These are items that 
are endorsed by about half of a nor- 
mative group and answered false by 
the other half. The limits of this 
range have been generally set at be- 
tween 40 and 60% endorsement 
(Fricke, 1957). Although this con- 
cept is defined by reference to a pat- 
tern of group responses, the term is 
used as if individual response ten- 
dencies were involved. Thus, when 
these items are referred to as in- 
different, the implication is that 
individuals within the group are in- 
different to that item, rather than 
that there are two distinct opinions 
held by two groups of equal size.
This usage seems justified, within 
limits, to the extent that items of 
high controversiality tend to be un- 
stable on test-retest (Fiske & Rice, 
1955) and that they tend to be 
rated as being of neutral social desir- 
ability value (Hanley, 1957). What 
has not been emphasized is the fact 
that the concept of controversiality 
is used with reference to the respon- 
ses of normative groups. Items which 
are controversial for one group may 
not be so for another and, indeed, if 
this were not so, such items would 
make no contribution to the dis- 
crimination of nonnormal groups. 
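
The group-relative character of controversiality can be shown with a short sketch. It applies the 40 to 60% endorsement range cited from Fricke (1957) to one hypothetical item answered by two hypothetical groups; the function names and the response data are assumptions made for illustration.

```python
def endorsement_rate(item_responses):
    """Proportion of a group answering "true" to one item."""
    return sum(1 for r in item_responses if r) / len(item_responses)


def is_controversial(rate, low=0.40, high=0.60):
    """True when the endorsement rate falls in the 40 to 60% range."""
    return low <= rate <= high


# The same item answered by a normative group and by a criterion group.
normative_item = [True] * 10 + [False] * 10   # 50% endorsement
criterion_item = [True] * 17 + [False] * 3    # 85% endorsement

normative_rate = endorsement_rate(normative_item)
criterion_rate = endorsement_rate(criterion_item)

print(normative_rate, is_controversial(normative_rate))
print(criterion_rate, is_controversial(criterion_rate))
```

The item is controversial for the normative group but not for the criterion group, and it is this difference in endorsement that allows the item to contribute to discrimination between the two.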

As has been observed by several 
writers (Berg, 1961; Wiggins & 
Rumrill, 1959) the scaling procedures 
used by Edwards (1953) to obtain 
rated “social desirability” values for 
inventory items may be considered as 
indirect assessments of item com- 
munality values. In this procedure, 
judges are given a pool of items and a
set of instructions that requires them 
to estimate the degree of desirability 
of an endorsement of the item by 
others (Edwards, 1957). Studies to 
date have employed five-point (Heine- 
man, 1952 unpublished), seven-point 
(Cowen & Stiller, 1959; Wiggins & 
Rumrill, 1959), and nine-point (Ed- 
wards, 1957; Hanley, 1956) interval 
scales. These ratings are obtained 
independently of the communality 
value of the items, although, as one 
might expect, substantial correla- 
tions have been reported between 
endorsement frequencies and rated 
social desirability scale values in cer- 
tain item pools (Edwards, 1957). Al- 
though the prediction of communality 
values from social desirability ratings 
has been interpreted as having pro-
found and somewhat sinister implica-
tions for inventory assessment (Ed-
wards, 1957), the equally plausible 
prediction of social desirability rat- 
ings from communality values is here 


WIGGINS 


considered to stem naturally from the 
function of item communality in 
criterion group strategy. 

The use of psychiatric criterion 
groups in the development of the 
MMPI clinical scales reflected the 
intent of the test authors (Hathaway 
& McKinley, 1951) to develop indices 
of the extent to which a given indi- 
vidual could be said to possess traits 
or test taking attitudes, or both, 
which are more typical of a psychi- 
atric group than they are of the popu- 
lation at large. The middle range of 
such scales may be thought of as re- 
flecting the degree of communality 
an individual possesses with the 
Minnesota normals on whatever trait 
or set of circumstances is involved in 
the scale. This middle range of the 
MMPI clinical scales has never been 
seriously proposed as a definitive cri- 
terion of “normality” nor have very 
fine discriminations been expected of 
it. We would expect a certain amount 
of "conformity" to be operative here 
in the statistical sense that the ma- 
jority of normals will endorse what 
is considered to be the acceptable re- 
sponse by the majority of normals. 
This tendency toward conformity be- 
comes of special interest when it 
appears in the exaggerated form of 
“hypercommunality.” That is, when 
the individual’s responses become 
determined, almost exclusively, by 
efforts to match anticipated cultural 
patterns of acceptability, rather than 
by honest assessment of his own posi- 
tion on the trait continuum. 

We are speaking then of the three 
broad categories into which an indi- 
vidual may be classified on the basis 
of his score on a single criterion- 
derived scale: he may score high on 
the scale, in the middle range of the 
scale, or at the lower end. Individuals 
scoring high on the scale exhibit a
noncommunality with the normative 
group and by implication are more 
like the criterion group on whatever
trait is measured by the scale. In
terms of Berg's Deviation Hypothe- 
sis (Berg, 1955, 1957, 1961) these 
individuals may be deviant in both a 
"relative" sense (in terms of differing 
from normative modal tendencies) 
and in an "absolute" sense (giving 
responses typical of a group defined 
to be deviant on other grounds). 
Individuals who attain scores in the 
middle range of the scale may be 
thought of as being more similar to 
the normative group than to the 
criterion group on the variable in 
question. Since the normative group 
is usually defined by default, with 
respect to psychiatric variables, there 
is considerably less information con- 
veyed by such a middle-range score. 

It is a peculiar property of criterion 
derived scales that individuals scor- 
ing very low resemble the normative 
group but not any of its individual 
members. The possible significance 
of this category of scorers is recog- 
nized by Sechrest and Jackson (1960) 
who speak of the "deviantly nondeviant,"
i.e., scorers who adhere to
the modal responses more consistently
than do members of the normative
group. The present concept of
hypercommunality refers to this same
category of low scorers and, in addition,
calls attention to the fact that
[it may] logically include those individuals
whose responses are chiefly
[determined] by considerations of social
desirability or exaggerated attempts
to respond in the direction of highest
estimated item communality.


COMMUNALITY AND SOCIAL
DESIRABILITY


During the past few years, a number
of individual difference measures
of the response style of social desirability
have appeared in the literature
which have, explicitly or implicitly,
employed the concepts of
controversiality, communality, and
social desirability (see Table 1).
Hanley (1961) has recently suggested
a classification scheme for social de- 
sirability scales based on whether re- 
sponse frequencies played a role in 
item selection and whether the social 
desirability of the items was deter- 
mined explicitly or implicitly. An- 
other basis for classifying social de- 
sirability scales, which has been 
stressed by the present writer (Wig- 
gins, 1959), is whether the response 
frequencies of the items were con- 
trasted with those of another group. 
In the method of contrasted groups, 
the responses of a group of subjects, 
who because of special instructions 
or special circumstances are con- 
sidered to be a group of high social 
desirability respondents, are com- 
pared with the responses of a control 
group. 

The method of contrasted groups 
was employed by both Cofer, Chance, 
and Judson (1949) and the present 
writer (Wiggins, 1959) in the develop- 
ment of two role playing scales of 
social desirability. The responses of 
a control group were contrasted with 
those of subjects instructed to answer 
the MMPI in terms of social desir- 
ability. In the construction of the L 
scale of the MMPI (Meehl & Hath- 
away, 1946), a group of two clinicians 
constructed items in terms of social 
desirability and guessed at the fre- 
quencies that would occur in a con- 
trol group. Two groups were con- 
trasted in spirit, if not in practice. 
The K scale of the MMPI (Meehl & 
Hathaway, 1946) was constructed by 
comparing the responses of one group 
of presumably nonfaking patients 
with those of another group thought 
to include a large number of social 
desirability respondents.²


2 Unfortunately, the interpretation of the
K scale is further complicated by the fact
that an additional set of items was added
to the scale that had been shown not to
discriminate between role playing and control
college groups.

TABLE 1
SOCIAL DESIRABILITY SCALES


SD: Ten judges were instructed to answer 
149 items from L, F, K, and MAS scales in 
such a way as to give the most socially de- 
sirable picture of themselves. Unanimous 
agreement on 79 items which were reduced 
to 39 items by item analysis. (Edwards, 
1957) 

Sd-A, Sd-R: Items from Welsh's (1956) Factor
Scales A (39 items) and R (40 items)
were rated for both "true" and "false"
responses on a 7-point scale by a total of
181 judges. Sd-A is a pool of low rated
items and Sd-R is a pool of moderate items.
(Wiggins & Rumrill, 1959)

K: Twenty-two items which differentiated
patients with high L scores and normal
MMPI profiles from a comparable group
of patients with abnormal profiles plus
eight items which remained unchanged
under role playing instructions in normals
and also differentiated severe disturbance
from normality. (Meehl & Hathaway, 1946)

Ex: Fifty-three items of high controversiality
(Hathaway norms) were rated by 92
[raters] on a 9-point scale of social desirability.
Twenty-six high and low rated
items constitute the scale. (Hanley, 1957)

Cof: Thirty-four items which were unchanged
under fake-bad instructions but changed
[under fake-good instructions] in a role playing
study utilizing 81 subjects. (Cofer
et al., 1949)

Sd: Records of 178 social desirability role
players were contrasted with 140 controls
to yield 40 differentiating items. (Wiggins,
1959)
L: Two judges made up 15 socially undesir- 
able items that they felt would be fre- 
quently endorsed by normals. (Meehl & 
Hathaway, 1946) 

Sx: Eight items were eliminated from Ex (see
above) to make a balanced scale of 9
"true" and 9 "false" items. (Hanley, 1957)

TSD: Using Heineman's data (1952) on
5-point favorability ratings for all MMPI
items by 108 subjects, all items less than
2.5 and greater than 3.5 were keyed for
favorability. (Wiggins, 1961)

ESD: Using Heineman's data (1952), the 39
items of extreme favorability value (less
than 1.5 or greater than 4.5) were keyed
for social desirability. (Wiggins, 1961)


Hanley (1957) has developed a 
scale with reference to two groups 
that were not directly contrasted
with one another. Item controversiality
was determined from the Minnesota
college norms (Dahlstrom &
Welsh, 1960) and controversial items
were given to another group to deter- 
mine their rated social desirability 
values. Items of high and low rated 
value were retained and, on rational 
grounds, it might be predicted that 
such items would be answered in the 
keyed direction more frequently by 
social desirability respondents than 
by an honest group. In the social 
desirability scale developed by Ed- 
wards (1957), the response frequen- 
cies of a single group were sufficient 
to determine item inclusion. The 
unanimous agreement of 10 role 
playing judges determined the direc- 
tion of item keying without reference 
to a control group. Similarly, the 
social desirability scales constructed 
from judges’ ratings of the desir- 
ability of items from Welsh’s A and 
R scales (Wiggins & Rumrill, 1959) 
were based on a single group of judges.

Table 2 presents data on four char- 
acteristics of social desirability scales 
that are relevant to our discussion of 
communality and rated social desir- 
ability. These characteristics are: 
communality of items in the scale, 
rated social desirability of items in 
the scale, endorsement versus rated 
social desirability of items in the scale, 
and the success achieved by the scale 
in identifying subjects instructed to 
answer the MMPI in terms of social 
desirability. 

Scale communality is defined as the 
average proportion of subjects, in a
normative group, who answered the
items in the direction in which the
scale is keyed (social desirability).
The communality values in Row 1
of Table 2 were computed from item
frequency counts of the records of 140
Stanford students (55 men, 85 women).
The social desirability scales
have been ordered, in terms of their
average communality values, from
high (on the left) to low (on the
right). In the bottom row of the
same table are the phi coefficients
reported in a study of the predictive
efficiency of social desirability scales
in separating a group of 250 social
desirability role players from a group
of 190 controls (Wiggins, 1959). It
can be seen that the average communality
value of a scale places a
limitation on the extent to
which it is sensitive to the communality
shifts produced by role
playing instructions. Thus, with a
scale such as Edwards' SD, in which
the average scale communality is
nearly 80%, even slight shifts in the
direction of social desirability would
approach the limit of maximum communality.
The social desirability
scales with lower average communalities,
on the other hand, are
sensitive to a much wider
range of communality increases and
this, among other things, is reflected
in the success of these scales
in identifying social desirability role
players.


TABLE 2
CHARACTERISTICS OF SOCIAL DESIRABILITY SCALES

                            SD     Sd-A   Sd-R   K       Ex     Cof    Sd     L
Communality (a)
  mean                      .79    .66    .55    .53     .50    .35    .30    .20
  σ                        (.13)  (.18)  (.24)  (.22)   (.19)  (.17)  (.10)  (.16)
Social desirability (b)
  mean                      5.70   5.47   5.20   4.65    5.23   4.69   4.72   4.39
  σ                        (.50)  (.65)  (.24)  (1.20)  (.62)  (.66)  (.74)  (.84)
Endorsement versus
social desirability (c)
  r                         .91    .55    .56    .46     .05   -.33   -.17    .58
Screening efficiency (d)
  phi                       .330   .386   .395   .217    .461   .619   .721   .539

a One hundred and forty college subjects.
b Fifty college raters.
c Group a versus Group b.
d (Wiggins, 1959).
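
The definition of scale communality given above can be illustrated with a minimal sketch. The response matrix, the key, and the function names below are hypothetical and are not taken from the studies cited; the sketch also assumes that the parenthesized values in Table 2 are standard deviations of the item communalities, which the text does not state outright.

```python
from statistics import mean, pstdev


def item_communalities(responses, key):
    """Proportion of subjects answering each item in the keyed direction.

    responses: one list per subject, True meaning the item was answered "true".
    key: the keyed (socially desirable) answer for each item.
    """
    n_subjects = len(responses)
    return [
        sum(1 for subject in responses if subject[i] == keyed) / n_subjects
        for i, keyed in enumerate(key)
    ]


def scale_communality(responses, key):
    """Mean and standard deviation of item communalities for one scale."""
    values = item_communalities(responses, key)
    return mean(values), pstdev(values)


# Hypothetical data: four subjects, three items keyed true, false, true.
responses = [
    [True, False, True],
    [True, True, False],
    [False, False, True],
    [True, False, False],
]
key = [True, False, True]
avg, spread = scale_communality(responses, key)
```

A scale whose average is near 1.0 leaves little room for scores to rise under role playing instructions, which is the limitation discussed in the text.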


The average scale social desirability
values given in the second row represent
the averaged median ratings
for all items in the scale when 
answered in the keyed direction 
(social desirability). Ratings of the 
item pool that contained all the social 
desirability scales were obtained from 
50 college students (24 men, 26 
women) who rated the social desir- 
ability of a true answer on a seven- 
point scale. Since the ratings were 
for the desirability of a true answer, 
it was necessary to reflect the ratings 
for items keyed false in the social
desirability scales.³ When these
median ratings and reflected median 
ratings are averaged across a given 
scale, an index of the intensity of a 
social desirability scale is provided. 
A close correspondence can be seen 
to exist between the communality 
values and the social desirability 


3 This reflection must be considered as only
an approximation to the rated social desirability
of answering false since previous
research (Wiggins & Rumrill, 1959) suggests
that empirical values will often differ from
reflected values.

values which range from "moderately
desirable" in SD to "neither desirable
nor undesirable" in the L scale.
Despite this apparent variation with 
communality values, the actual range 
of social desirability values is quite 
small, all scales being within 1.31 
points of each other. The com- 
munality values, on the other hand, 
cover a range of 59 percentage points. 
The third row presents the correla- 
tion between the rated social desir- 
ability of a true answer and the num- 
ber of subjects who actually endorsed 
the item in an independent group of 
140 college men and women. This 
was calculated in the now standard 
method for estimating the extent to 
which social desirability considera- 
tions influence responses to items in 
a given scale (Edwards, 1957). It 
can be seen that despite the limited 
variability in average social desir- 
ability value, there is considerable 
variation among the scales in the re- 
lationship between endorsement and 
rated desirability. This ranges from 
a near-zero correlation in Hanley's 
Ex to a .91 in Edwards' scale. It can 
also be seen that as the communal- 
ities approach controversiality, there 
is a corresponding decrease in the en- 
dorsement-favorability relationship 
and that as the communalities drop 
below 50%, the relationship becomes 
negative—with the notable exception 
of the L scale. It should be recalled 
that the rationale behind Hanley's 
Ex scale was such that he predicted a 
near-zero correlation would exist in 
an “honest” population (Hanley, 
1957). To the extent that this popu- 
lation of students is honest, we must 
conclude that the pool of high com- 
munality items that constitute Ed- 
wards' SD would tend to make us 
suspicious of almost everyone. 
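
The second and third rows of Table 2 rest on two small computations: reflecting the rating of a true answer for items keyed false, and correlating rated desirability with endorsement. The following is a minimal sketch under stated assumptions: the seven-point scale is taken to run from 1 to 7, so that reflection is 8 minus the rating (the text does not give the formula), and all data and function names are hypothetical.

```python
from statistics import mean

SCALE_POINTS = 7  # seven-point rating scale, assumed to run from 1 to 7


def keyed_rating(true_rating, keyed_true):
    """Rated desirability of the keyed answer; false-keyed items are reflected."""
    return true_rating if keyed_true else (SCALE_POINTS + 1) - true_rating


def scale_intensity(median_true_ratings, key):
    """Average of the (reflected) median ratings across the items of a scale."""
    return mean(keyed_rating(r, k) for r, k in zip(median_true_ratings, key))


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical three-item scale keyed true, false, true.
median_true_ratings = [6.0, 2.0, 5.0]   # rated desirability of answering "true"
key = [True, False, True]
proportion_true = [0.8, 0.2, 0.6]       # endorsement in a separate group

intensity = scale_intensity(median_true_ratings, key)
r = pearson_r(median_true_ratings, proportion_true)
```

As the footnote to the text cautions, a reflected rating only approximates the rated desirability of answering false, so an index built this way should be read accordingly.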
All this, of course, raises the gen- 
eral issue of what properly constitutes 
an individual difference measure of
social desirability and what would be
the expected behavior of such a 
measure under role playing instruc- 
tions? Edwards is quite explicit 
about this in his monograph on 
the social desirability variable (1957). 
It is his contention that since social 
desirability is a relatively all-perva- 
sive influence in determining inven- 
tory responses, the majority of nor- 
mals should be considered as possess- 
ing a substantial amount of the trait 
to begin with. Under social desir- 
ability instructions, he reasons, scores 
of the minority who are usually un- 
influenced by social desirability will 
shift in the direction of the majority 
whose faking scores remain un- 
changed, and all scores become more 
homogeneous (Edwards, 1957, pp. 
55-56). 

In the role playing study already 
mentioned (Wiggins, 1959), the dis- 
simulation measures (including that 
of Edwards) exhibited increased, 
rather than decreased variability 
under special instructions. More im- 
portant than this, however, is the 
actual distribution of scales under 
the two sets of instructions. Figure 
1 presents these distributions for 
two scales of markedly different 
effectiveness in differentiating the two
instructional groups. The upper part 
of the figure shows the distribution 
of the writer's Sd under the two con- 
ditions. This scale, of course, was 
more or less “custom-made” for this 
type of discrimination and it is not 
surprising that satisfactory separa- 
tion of the two groups is achieved. 
Under standard instructions, the Sd 
scale is seen to be relatively normally 
distributed at the lower end of the 
possible range of scores. This reflects, 
among other things, the low com- 
munality values of the items in the 
scale. Under fake instructions, it can
also be seen that the faking group
takes full advantage of the room for
[movement] by distributing itself along
the [entire] length of the scale, with
[its mean] score at a comfortable distance
from that of the control group.
The SD scale of Edwards, at the
bottom of the figure, presents a
[different] picture. The control
group does, as Edwards predicts,
show a [marked] piling up at the upper
[end of the] scale and this would be
anticipated by the high average communality
value of the scale itself.
Under role playing instructions, shifts
in the direction of hypercommunality
appear to be limited by the already
high communality value of the scale.
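
The separation of the two instructional groups is summarized in Table 2 by phi coefficients. As an illustration only, phi for a fourfold table of group membership against a cutting score might be computed as below. The cell counts are hypothetical; only the group sizes (250 role players, 190 controls) come from the text, and the cutting scores actually used are not given here.

```python
from math import sqrt


def phi_coefficient(a, b, c, d):
    """Phi for a 2 x 2 table laid out as:

                        above cut   below cut
        role players        a           b
        controls            c           d
    """
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denominator


# Hypothetical split of 250 role players and 190 controls at some cutting score.
phi = phi_coefficient(200, 50, 30, 160)
```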

It should be emphasized that the 
role playing experiment is not being 
espoused as the ultimate criterion for 
social desirability scales. It is readily 
conceded that there is a certain 
amount of artificiality to such a pro- 
cedure that limits its generalization 
(Hanley, 1961). However, if social 
desirability is conceived of as a tend- 
ency toward communality which at 
its extreme end exceeds the communality
of the average group member,
then social desirability scales
must themselves be of sufficiently low 
communality value to allow for detec- 
tion of this extreme. If we take a 
pool of items of high communality 
value scored in the direction of social 
desirability, and administer them to a
standard group and a role playing 
group, we would not expect large 
shifts to occur under the two condi- 
tions. However, we should not inter- 
pret this to mean that the instruc- 
tions did not produce the style when 
there is the alternative explanation 
that the particular item pool did not 
provide sufficient occasion for the 
detection of the style in operation. 
Finally, if social desirability response 
style is considered to be a style that 
virtually everyone possesses to a rela- 
tively invariant high degree, then it 
is difficult to see its relevance to the 
study of individual differences in 
general or stylistic tendencies in par- 
ticular. 
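
The ceiling argument can be put in rough arithmetic terms. The sketch below treats the expected score of a standard-instruction group as the number of items times the mean scale communality, which is an approximation introduced here for illustration; the item counts and communality values are those given in Tables 1 and 2, and the function name is hypothetical.

```python
def expected_score_and_headroom(n_items, mean_communality):
    """Approximate control-group mean score and the room left below the ceiling."""
    expected = n_items * mean_communality
    return expected, n_items - expected


# Edwards' SD: 39 items, mean communality .79 (Tables 1 and 2).
sd_expected, sd_room = expected_score_and_headroom(39, 0.79)

# The writer's Sd: 40 items, mean communality .30 (Tables 1 and 2).
wiggins_expected, wiggins_room = expected_score_and_headroom(40, 0.30)
```

On these figures a role playing group can move upward by roughly 8 items on SD but by roughly 28 items on Sd, which is the sense in which a high communality pool provides little occasion for detecting the style.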

A recent paper by Crowne and 
Marlowe (1960) presents an alterna- 
tive, although parallel, rationale for 
constructing individual difference 
measures of social desirability re- 
sponse style. These authors stress the 
improbability of college students 
possessing the deviant symptomatol- 
ogy described in Edwards' SD items 
of extreme communality values and 
argue that "it cannot be determined 
whether these responses are attribut- 
able to social desirability or to a 
genuine absence of such symptoms" 
(p. 349). For this reason, the authors 
selected items "defined by behaviors 
which are culturally sanctioned and 
approved but which are improbable 
of occurrence" (p. 350). No indication
is given of the criteria which
guided this rational selection although
the authors felt they were "avoiding
the ambiguities of the statistical
deviance approach" by their procedure.
Evidence of their success in
avoiding items of pathological impli- 
cations is provided by the mean item 
rating of 2.9 on a 5-point adjustment 
scale applied by 10 judges. 

That the Marlowe-Crowne Social 
Desirability scale (M-C SD scale) is 
of sufficiently moderate average com- 
munality value to be sensitive to 
shifts produced by role playing in- 
structions is suggested by its reported 
normal distribution, in a standard in- 
struction group, around a mean scale 
score involving less than half (42%) 
of the items in the scale. The authors 
contrast this distribution with that of 
Edwards’ SD which has negative skew 
around a mean scale score involving, 
on the average, some 82% of the 
items. Since only a fraction of the 
M-C SD scale items were selected 
from the MMPI, its effectiveness in 
identifying groups of high social de- 
sirability respondents cannot be eval- 
uated from existing data. 


AGREEMENT-DISAGREEMENT AS 
METHOD AND STYLISTIC 
VARIANCE 


Definitions of Acquiescence 


The term “acquiescence” is used 
throughout this paper to indicate a
general tendency on the part of sub- 
jects to agree with a statement or test 
item when no issue seems at stake. 
This is, perhaps, a deep-seated cul- 
tural bias to be agreeable when it 
does not cost anything, to "go along
with the gang" or the printed word,
when no personal inconvenience is
attendant upon such acquiescence. 
At the other end of this postulated 
dimension is found the tendency to 
be reluctant to commit oneself to 
relatively neutral assertions—a dis- 
position to be “cautious” with respect 
to apparently innocuous issues. 

When this dimension is measured
within the realm of personality inventories,
it is important to note that
it refers to response option preferences
with respect to a special class of
items, i.e., items of "high controversiality"
(medium communality)
values. Thus, the "acquiescer" will
be identified by the number of times 
he selects true as a response option 
from a pool of high controversiality 
items and the cautious individual by 
the number of false responses to the 
same item pool. Attempts to assemble 
pools of high controversiality items 
from the MMPI by Fricke (1957), 
Hanley (1957), Fulkerson (1958), 
and the present writer (Wiggins, 1961) 
would lead to the conservative esti- 
mate that less than 10% of the total 
pool of MMPI items can be so classi- 
fied. In addition, as Hanley (1961) 
has pointed out, even items which 
meet this statistical criterion of "indifference"
may harbor subtle communality
variance which may be
detected by applying social desirability
rating procedures to medium communality
items. This is not to de-emphasize
the possible contribution
of acquiescent variance to the MMPI
but rather to indicate that highly refined
scaling procedures are essential
for its unambiguous measurement.
By "unambiguous" is meant the separation
of acquiescent response style
from both communality variance arising
from assessment strategy and
[content] variance peculiar to any given
[inventory]. Since it is difficult to define acquiescence
on other than statistical
grounds, contrasted criterion groups
have [not] been employed in the development
of individual difference
measures of this variable (see Table
3). [General] agreement on the
[desirable] characteristics of acquiescence
measures does not appear to
exist. Most authors would
agree, in principle, that an "acquiescence
scale" should be constructed

TABLE 3
ACQUIESCENCE SCALES 


B: Using Hathaway's data on endorsement 
frequencies for a combined college and 
normal group, 63 items of high contro- 
versiality (40-60%) were keyed "true." 
(Fricke, 1957) 

Bn: Using groups of raters totaling 151,
Hanley (1961) obtained 9-point desirability
ratings on the 63 items from B
above. The 32 items judged to be of
"neutral" value (4-6) were keyed "true."
(Wiggins, 1961)

Rb: Eighty-four items of high controversiality
(40-60%) in a group of 190 college subjects
were rated on a 7-point scale by a
different group of 50 subjects. The 27
items of "neutral" value (3.5-4.5) were
keyed "true." (Wiggins, 1961)

Acq: From a group of 472 aviation cadets 
who had been rated for adjustment, 46 
items of 40-60% controversiality were 
found which did not discriminate between 
adjusted and nonadjusted subjects. This 
pool was reduced to 24 items by internal 
consistency and they were keyed “true.” 
(Fulkerson, 1958) 

At: Keying was changed on half the items in
Sx (see Table 1) to yield an 18-item "all
true" key. (Hanley, 1957)

AT: The number of items, out of 566, that
are answered "true" by a subject taking
the MMPI.


from a heterogeneous pool of items of 
uniformly high “ambiguity” which 
are all keyed in the true direction. 
The statistical criterion of ambiguity 
(controversiality) has already been 
discussed. In an ideal normative 
sample, we would require that the ex- 
pected communality value of each 
item be .50 and that the expected 
scale score be n/2, where n = the number
of items in the scale.
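
The statistical criterion just described can be sketched as a selection and scoring procedure. The 40-60% band follows the definitions in Table 3; the endorsement proportions, the subject's responses, and the function names are hypothetical, and the sketch omits the further screening for neutral social desirability that several of the scales apply.

```python
def controversial_items(endorsement, low=0.40, high=0.60):
    """Indices of items whose proportion of "true" answers lies in the band."""
    return [i for i, p in enumerate(endorsement) if low <= p <= high]


def acquiescence_score(subject_responses, item_indices):
    """Number of "true" answers a subject gives to the selected items."""
    return sum(1 for i in item_indices if subject_responses[i])


# Hypothetical normative proportions of "true" answers for five items.
endorsement = [0.45, 0.90, 0.52, 0.10, 0.60]
pool = controversial_items(endorsement)

# Hypothetical subject; True means the item was answered "true".
subject = [True, True, False, False, True]
score = acquiescence_score(subject, pool)
expected = len(pool) / 2  # the ideal expected scale score, n/2
```

Because every retained item is keyed true, the complementary count of false answers to the same pool gives the "cautiousness" score.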
Fricke (1957) selected items of high 
controversiality from Hathaway's 
normative data (Dahlstrom & Welsh, 
1960) and keyed them all true as a 
measure of acquiescence—the B scale. 
Hanley (1961) obtained social desir- 
ability ratings on Fricke's B scale 
items and found an imbalance of 
socially undesirable items to exist.
The 32 items that were judged to be
of neutral social desirability by 
Hanley's raters constitute a revised B 
scale (Bn) of neutral social desir- 
ability value. Similarly, the writer's 
Rb scale contains items of high con- 
troversiality selected from more com- 
plete normative data and independ- 
ently rated for the purpose of retain- 
ing items of rated neutral social desir- 
ability. Fulkerson's (1958) Acqui- 
escence scale (Acq) contains items of 
high controversiality that have been 
demonstrated not to discriminate be- 
tween deviant (maladjusted) and 
normative groups. 

The criterion for the “heterogene- 
ity" of an acquiescent item pool is 
not as clear cut. This may be be- 
cause of the implicit notion of item 
"content" which is itself replete with 
measurement difficulties (Berg, 1959). 
Cohn (1953) describes an MMPI acquiescent
item pool that was submitted
to two clinicians who were "in
agreement that no particular personality
syndrome was being tapped by
the item content" (p. 335). Couch
and Keniston (1960) using another inventory
chose items from "a large
number of heterogeneous scales [so
that] . . . a total score over all items
makes no psychological sense" (pp. 
152-53). In developing an acqui- 
escence measure with achievement 
items, Gage, Leavitt, and Stone 
(1957) employed items that were 
"sufficiently difficult and obscure to 
elicit approximately a 50-50 split of 
"true" and "false" responses" (p. 98). 

The foregoing descriptions of ac- 
quiescence item pools imply that the 
items are not measuring common con- 
tent dimensions and that consistent 
agreement is therefore interpretable
as acquiescence rather than as a
content dimension. A literal interpretation
of this implication would be
that an acquiescence item pool is
lacking in statistical homogeneity,
i.e., has low interitem correlations.
This line of reasoning has been pre- 
sented by Hanley (1957) in a slightly 
different context—that of the ration- 
ale for a measure of test taking defen- 
siveness. Applying Hanley's argu- 
ment to acquiescence measures, we 
would expect a pure measure of ac- 
quiescence to have zero internal con- 
sistency in an ideally nonacquiescent 
population and substantial internal 
consistency in a population com- 
posed of only acquiescent individuals. 

Cohn (1953) does not report the 
internal consistency of his MMPI 
“Plus scale" so that its lack of con- 
tent homogeneity must rest on the 
opinions of his two raters. In this 
connection, Gage et al. (1957) re- 
port that when the items from 
Cohn's Plus scale were given to four 
of their judges, there was substantial 
independent agreement that almost 
all of the items measured a "tendency
to self-disparagement" when answered
true, which would suggest internal
consistency on a content basis.
Couch and Keniston (1960) seem 
to place a premium on the high in- 
ternal consistency of their measure of 
acquiescence: "The first sign of the
importance of this measure was indi- 
cated by the high (+.85) Spearman- 
Brown split-half (even-odd) reliabil- 
ity of the 360-item scale. The OAS 
thus provides a reliable measure of 
agreeing response set as defined in 
this study” (p. 153). Assuming, as 
Couch and Keniston do, that “a 
total score over all items makes no 
psychological sense" (p. 153), the 
substantial internal consistency would
suggest that their sample of college
men were a highly acquiescent group. 
However, the distribution of scale 
scores (which were expressed on a
seven-point Likert scale) was from 3.1
to 4.5 which means that none of the
subjects' scores fell in the "Agreement"
end of the scale (i.e., greater
than 4.5). Moreover, although the
overall mean score of the 61 subjects
was 3.9, less than 1% of all responses
fell in the "No Answer" (4.0) category
and the response data suggest
that the majority of subjects gave 
equal proportions of agreeing and dis- 
agreeing answers. 

Gage et al. (1957) are even more 
explicit in their insistence upon an 
internally consistent acquiescence 
scale. Their group of 50 "difficult"
information items, scored all true, had
a difficulty level of approximately
.50, and a corrected split-half reliability
of .68. Another group of 40
"easy" information items, scored all 
true, had a difficulty level of approxi- 
mately .75, and a corrected split-half 
reliability of .09. The authors conclude:
"That difficult and ambiguous
items are required to elicit the acquiescence
set is again demonstrated
by the fact that the reliability of the
acquiescence score on the easy information
items was only .09" (p. 98).
The subjects were 118 graduate students
in education who were apparently
not thought, on a priori grounds,
to be a particularly acquiescent 
group. 

It should be apparent that the 
measurement of acquiescence pro- 
vides a psychometric paradox which 
has only been partially recognized. 

To the extent that acquiescence may
be said to contribute variance to a
given measure (be it criterion-valid
or systematic error variance) it must
do so with some acceptable degree of
reliability. The internal consistency
and generality of the tendency to
acquiesce have been stressed from the
[beginning] (Cronbach, 1942, 1946,
1950). Nevertheless, the agreement
tendency itself has been defined with
respect to heterogeneous and ambiguous
items in such a manner that
internal consistency in a presumed 
measure of acquiescence may itself be 
interpreted as content variance which 
would vitiate the acquiescence meas- 
ure on logical grounds. It has been 
proposed that Hanley’s (1957) ra- 
tionale for constructing a measure of 
test taking defensiveness may be ap- 
plied to acquiescence measures as a 
partial resolution of this apparent 
paradox. Here, as with social desir- 
ability, the importance of specifying 
theoretical expectations concerning 
the amount of the style to be found 
in a given group is apparent. 

Table 4 presents data on the sta- 
tistical characteristics of the MMPI 
acquiescent item pools already dis- 
cussed. These calculations are based 
on the records of 100 college subjects 
(50 men, 50 women) who took the 
MMPI as a supplemental require- 
ment of Introductory Psychology and 
are not therefore considered to be an 
especially acquiescent group. Column 
3 lists the average proportion of items 
in the total scale that were endorsed 
by this group (p̄ = X̄/n). With the exception
of the 566 "all true" scale
(AT), which is not proposed as a
meaningful acquiescence measure,
none of the acquiescence measures
depart from the expected average
scale score value of n/2 (p = .50).


TABLE 4
CHARACTERISTICS OF ACQUIESCENCE SCALES

Acquiescence   No. of
scale          items     p̄      σ      r(KR-21)   r(SB-corr)

Rb               27      .50    .12    .36        .56
3% MEETS 107
3% TRAC AM 60
At               18      .49    .16    .46        .75
B                63      .43    .10    .65        .65
AT              566      .4*    .06    --         --

* Different from p = .50 at .001 level.

According to our previously developed
rationale, a further requirement
of a desirable acquiescent pool
would be relatively low interitem cor- 
relations in this population of sub- 
jects. Since considerable care was 
taken in all of these item pools to in- 
sure that items would be of approxi- 
mately equal difficulty level, Kuder- 
Richardson Formula 21 was em- 
ployed to yield a lower-bound esti- 
mate of the internal consistency of 
the acquiescence scales. Column 5 
presents the estimated internal con- 
sistency coefficients for acquiescence 
scales based on differing numbers of 
items. Unfortunately, we are unable 
to specify with any precision the 
degree of internal consistency an ac- 
quiescence scale should possess in 
this population. From a practical 
standpoint, it would seem that the 
.36 internal consistency coefficient of
Rb represents the least amount of 
content variance that one might ex- 

pect from an MMPI acquiescent 
item pool in a college group. The 
rather substantial internal consist- 
ency of Fricke's original B scale 
(r=.65) is almost identical with the 
-64 reported by Hanley (1961) for a 
college group and is taken as further 
confirmation of his suspicion that 
content variance is involved in this 
scale. 
The last column of Table 4 gives 
the internal consistency coefficients 
corrected by the generalized Spear- 
man-Brown formula to a common 
base of 63 items. Although this pro- 
vides a common basis for comparison 
of the various methods of obtaining 
acquiescent items, it should be re- 
called that most of the methods were 
exhaustive. The 27 items that form 
the Rb scale, for example, represent 
the only items from the original 566- 
item pool that met the specified 
criteria for inclusion. The internal
consistency of an Rb scale augmented
by 36 items is therefore only of
academic interest.
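
The two computations described above can be illustrated with a minimal sketch. It assumes the standard textbook forms of Kuder-Richardson Formula 21 and of the generalized Spearman-Brown formula; the function names and the inputs to `kr21` are hypothetical and are not taken from the study. Only the At row of Table 4 (18 items, coefficient of .46) is used as a check on the correction to a 63-item base.

```python
def kr21(n_items, mean_score, score_variance):
    """Kuder-Richardson Formula 21, computed from the number of items,
    the mean total score, and the variance of total scores."""
    k = n_items
    return (k / (k - 1)) * (1 - mean_score * (k - mean_score) / (k * score_variance))


def spearman_brown(r, n_items, target_items):
    """Generalized Spearman-Brown estimate of the reliability of a scale
    lengthened (or shortened) from n_items to target_items."""
    ratio = target_items / n_items
    return ratio * r / (1 + (ratio - 1) * r)


# At scale in Table 4: 18 items, KR-21 coefficient of .46, corrected to 63 items.
at_corrected = spearman_brown(0.46, 18, 63)
print(round(at_corrected, 2))  # 0.75, agreeing with the last column of Table 4

# Hypothetical inputs, for illustration only: a 10-item scale with a mean
# score of 5 and a score variance of 6.25.
print(round(kr21(10, 5.0, 6.25), 2))
```

Applying the same correction to a scale that already has 63 items (the B scale) leaves its coefficient unchanged, which is why the last two columns agree for that row.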


Acquiescence and Communality 
as Dimensions of the MMPI 


There are several views on the 
dimensionality and relationship of 
agreement tendencies and communal- 
ity that are not easily reconciled. 
Edwards (1957) originally took the 
position that the tendency to respond 
in terms of perceived social desir- 
ability value (communality) was a 
single dimension* and that agree- 
ment-disagreement tendencies were 
interpretable primarily in terms of 
this same concept. The present 
definition of acquiescence makes it 
logically independent of communality 
variance. Thus, with a pool of high 
controversiality items, the number of 
true responses (acquiescence) will be 
perfectly negatively correlated with 
the number of false responses
(cautiousness). However, when items of
extreme communality are involved,
there is no necessary relationship
between the proportion of "deviant
true" responses and the proportion
of "deviant false" since they are not
required to sum to unity. Evidence
that there is, in fact, no relationship
would be Barnes' (1956b) finding
that Deviant True and Deviant False
are uncorrelated in the total MMPI
pool and the recent report (Jackson &
Messick, 1961) that Deviant True
and Deviant False items are negligibly
correlated (r = .13) in the F scale.

Jackson and Messick (1958, 1961) 
have consistently called attention to 
the possibility of acquiescence being 
elicited differentially by item pools 
of differing levels of rated social
desirability. Their most recent
formulation (Jackson & Messick, 1961) of
social desirability (communality) and
acquiescence as orthogonal factors
underlying MMPI scales is compatible
with our assertion that there is
no logical reason that these two
measures should be related. Indeed, this
factorial study provides the most
direct evidence available for the
independence of these two measures
since their marker variable for
acquiescence meets our restricted
definition of this response style (Jackson
& Messick, 1961).

* Messick (1960) has recently cast serious
doubt on the unidimensional character of
rated social desirability values.

Ideally, if the number of true and 
false deviant answers were neatly 
balanced in each MMPI scale, it 
would be possible to speak of rela- 
tively "pure" measures of noncom- 
munality with respect to certain 
criterion group variables. That this
state of affairs is unlikely, however,
is indicated by the fact that 63% of
the items in the MMPI reflect a lack
of communality with the population
at large when answered true while
only 37% of the items indicate a
divergence from population norms
when answered false. On this basis
alone, we would expect a strong
deviant true bias to be present in the
MMPI clinical scales and, indeed,
Messick and Jackson (1961) have
demonstrated that the strongest
and most consistently isolated factor
in MMPI factor studies is interpretable
as a deviant true factor. Although
Messick and Jackson (1961) speak
of this factor as acquiescence,
they are not using the term here in
the restricted sense of the [...]. We
would consider this factor to be [...]
of acquiescent style [...] a method
(test) that employs an unbalanced item
pool.

Both proponents of the Deviation
Hypothesis (Barnes, 1956b; Berg, 1961)
have postulated response biases which
include both agreement tendencies and
tendencies to answer deviantly (in the
direction of noncommunality). The most
convincing evidence for the existence of
these two biases comes from Barnes'
(1956b) demonstration that the tendency
to answer deviantly true on the MMPI
is an impressive predictor of the
"psychotic scales" while the tendency
to answer deviantly false is
substantially related to the MMPI
"neurotic scales." Additional evidence
of the importance of these postulated
deviant response tendencies is found in
Barnes' (1956a) later observation that
Deviant True appears to be a "pure
factor test" of Wheeler, Little, and
Lehner's Factor I (1951) on the MMPI,
while Deviant False is heavily loaded on
Factor II of the same study.

Even if it is conceded that Deviant 
True and Deviant False appear to 
account for much of the variance in 
the MMPI, this does not commit one 
to the position that there are two 
independent "deviant response styles" 
in operation. These gross measures 
would seem to include all the sources 
of variation we have defined thus far, 
namely: variance in relative com- 
munality arising from the strategy of 
criterion groups, variance attribut- 
able to the postulated stylistic dimen- 
sion of acquiescence-cautiousness, and 
variance attributable to the partic- 
ular item pool represented by this 
test with its relative imbalance (63%) 
of low communality items keyed true. 

In order to investigate the
relationship of Deviant True and Deviant
False to these other postulated sources
of variation, it was thought
necessary to replicate Barnes' (1956b)
procedure in a slightly larger and more
homogeneous population. In addition,
we were interested in the
relationships of the individual difference
measures of social desirability
(communality) and acquiescence to these
deviance dimensions as well as the
relationships of several of the MMPI
clinical and special scales which are
relevant to the present argument.


THE DEVIATION HYPOTHESIS AND
THE MMPI 


With the foregoing in mind, a 356- 
item Deviant True key and a 210- 
item Deviant False key were con- 
structed in a manner analogous to the 
scoring method used by Barnes 
(1956b) with individual record forms. 
The item scoring direction or direction
of noncommunality for normative groups
is given by the authors of the MMPI
for all items (Hathaway & McKinley,
1951). Items were separated into
Deviant True and Deviant False groups
on this basis.

When scores from these two keys 
were correlated in our sample of 100 
college subjects (50 men and 50 wom- 
en), the correlation between them 
was found to be .001. This striking 
result fully supports Barnes' (1956b) 
contention that the two measures are 
independent of each other. In addition,
the fact that Deviant True and
Deviant False are completely
uncorrelated leads to a satisfying clarity of
conceptualization in viewing their
relative contributions to possible
stylistic dimensions and to the various
MMPI scales. Figure 2 presents
this conceptualization. The abscissa
represents degrees of correlation of a
given scale with the 210-item Deviant
False key, ranging from -1.00 to
+1.00. The ordinate represents
degrees of correlation of a given variable
with the 356-item Deviant True key,
from -1.00 to +1.00. The justification
of this orthogonal plotting of the
two response measures is the zero
correlation that exists between them in
this group of 100 college subjects.

[Figure 2: scatter plot. Ordinate: Deviant True; abscissa: Deviant False;
both axes from -1.00 to +1.00. Region labels: Non-communality,
Acquiescence, Hyper-communality, Cautiousness.]

FIG. 2. Correlations of MMPI measures with Deviant True and Deviant False keys.
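
The scoring and correlation step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: the three-item "pool," the four response records, and the function names are all hypothetical. A Deviant True score counts keyed items answered true, a Deviant False score counts keyed items answered false, and the two sets of scores are then correlated across subjects.

```python
from math import sqrt


def key_score(responses, key_items, keyed_answer):
    """Count the items in key_items that were answered with keyed_answer."""
    return sum(1 for item in key_items if responses[item] == keyed_answer)


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical miniature pool: items 1 and 2 are deviant when answered true,
# item 3 is deviant when answered false.
deviant_true_key = [1, 2]
deviant_false_key = [3]

# Hypothetical records (item number -> response), one per subject.
records = [
    {1: True, 2: True, 3: False},
    {1: True, 2: True, 3: True},
    {1: False, 2: False, 3: False},
    {1: False, 2: False, 3: True},
]

dt_scores = [key_score(r, deviant_true_key, True) for r in records]
df_scores = [key_score(r, deviant_false_key, False) for r in records]
print(dt_scores, df_scores, pearson_r(dt_scores, df_scores))
```

Because the two keys share no items, nothing forces their scores to covary; in this contrived set of records they are exactly uncorrelated, which is the situation that would justify plotting the two measures on orthogonal axes.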

A relatively pure measure of De- 
viant True is provided by Block's 
(1953) Neurotic Undercontrol scale 
(Nuc) which appears at the top of the 
y-axis. This measure of deviant 
undercontrol may be thought of as 
reflecting the ‘‘yea-saying’’ character 
style described by Couch and Kenis- 
ton (1960). Welsh’s (1956) Factor I 
or A scale is another strong candidate 
for inclusion as a measure of Deviant 
True as has been emphasized by 
Messick and Jackson (1961). It is of 
interest to note that while the K
scale (at the negative end of the
y-axis) is a strong measure of the extent
to which a person does not respond
in the Deviant True direction, it
is unrelated to the tendency to give
deviantly false responses. Pure
measures of Deviant False are not as
easily identified among the scales
included in this analysis. However, it
should be noted that Block's Neurotic
Overcontrol scale (Noc), which
is presumably orthogonal to his Nuc
scale and which might be thought of
as a [...] scale (Couch & Keniston,
1960), is highly related to
the Deviant False dimension.

Perhaps the most interesting feature
of this scatter plot is that the
scales that fall within each of the four
quadrants have a definite logical
relation to one another. Quadrant I
contains [...] MMPI clinical scales of the
[...] variety, Quadrant II contains
[...] different measures of [...]
acquiescence, Quadrant III contains
seven measures of social desirability,
and Quadrant IV contains [...] scales
that are presumably related to
repressive overcontrol or denial.

The scales within a quadrant can 
be seen to vary in the extent to which 
they are related to Deviant True, 
Deviant False, or both response 
measures. The 45 degree lines ex- 
tended out through the quadrants in 
a manner analogous to "fusion fac- 
tors" (Kassebaum, Couch, & Slater, 
1959) represent points of balance or 
symmetry between these two re- 
sponse measures. 

That is, scales which fall along or 
near these lines may be considered to 
be equally influenced by both Deviant 
True and Deviant False keyings. Al- 
though the format of the MMPI and 
the purposes for which it was con- 
structed make it unlikely that any 
relevant stylistic dimensions will be 
isolated that are unrelated to Deviant 
True and Deviant False, scales to 
which these measures contribute 
equally have some potential for meas- 
uring other stylistic dimensions. Con- 
sequently it is of interest to consider 
the scales that fall along these balance 
lines. 
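
The geometry of the quadrants and balance lines can be made concrete with a small sketch. It assumes the conventional counterclockwise numbering of quadrants, with the Deviant False correlation on the abscissa and the Deviant True correlation on the ordinate, and it takes the quadrant labels from the text; the function name and the example correlations are hypothetical and are not read from Figure 2.

```python
from math import atan2, degrees

# Quadrant labels as used in the text (numbering convention is an assumption).
QUADRANT_LABELS = {
    1: "noncommunality",
    2: "acquiescence",
    3: "hypercommunality",
    4: "cautiousness",
}


def locate_scale(r_deviant_false, r_deviant_true):
    """Return (quadrant, label, degrees away from that quadrant's 45-degree
    balance line) for a scale plotted at (r_deviant_false, r_deviant_true).

    A point lying exactly on an axis is assigned to the quadrant that begins
    at that axis (counterclockwise), with an offset of 45 degrees."""
    angle = degrees(atan2(r_deviant_true, r_deviant_false)) % 360
    quadrant = int(angle // 90) + 1
    balance_angle = 45 + 90 * (quadrant - 1)
    return quadrant, QUADRANT_LABELS[quadrant], abs(angle - balance_angle)


# Hypothetical scale correlating -.30 with Deviant False and +.60 with Deviant True.
print(locate_scale(-0.30, 0.60))
```

A scale with an offset near zero is, in the sense used above, equally influenced by the two keys; a scale with an offset near 45 degrees lies on one of the axes and is essentially a measure of Deviant True or Deviant False alone.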

In Quadrant I, Pd and the F 
scale are quite close to the balance 
line. The F scale, which has been
referred to as the "screwball scale"
(Fricke, 1957), is a relatively powerful
measure of noncommunality. It consists
of items of highly asymmetrical
endorsement frequencies which are 
keyed in the direction of deviance. In 
this particular college population, Pd 
is seen to be an even better-balanced 
measure of noncommunality. In 
Quadrant II, the distance of the vari- 
ous acquiescence scales from the 
balance line seems to reflect the ex- 
tent to which care was taken in their 
construction to take out the effects of 
adjustment or communality variance. 
The most successful of the scales, in
this respect, seems to be the Rb scale
developed by the writer from a pool
of items of both high controversiality
and neutral rated social desirability
value. This scale is interpreted as 
reflecting the tendency to answer 
true to indifferent items or acqui- 
escence as it has been used through- 
out this paper. 

In Quadrant III, the social desir- 
ability scales tend to be heavily in- 
fluenced by the tendency to respond 
in the opposite direction from Devi- 
ant True as represented by the K 
scale. Consequently, most of them 
are some distance from the line that 
would represent a balanced measure 
of hypercommunality. The ESD 
scale contains items of extreme social 
desirability value and it would prob- 
ably be difficult to find individuals 
who did not score high on this scale. 
Cofer's scale (Cof) with its interesting 
item pool, already discussed, falls 
farther from the origin, but cannot be 
considered a completely “balanced” 
scale. The Quadrant IV balance line
is well represented by Welsh's (1956)
second factor "repression" scale, Scale
R. Block's (1953) Ego Control scale
(Ec-III), which is presumably a
fusion factor of the Noc and Nuc
scales, also falls near this balance
line. Scales which fall along this
balance line are interpreted as having
potential for measuring the tendency
to deny relatively neutral or
nondeviant statements in the MMPI
("cautiousness").

The point of view taken in this
paper has been that Deviant True
and Deviant False contribute to
scale variance on the MMPI because
they are the receptacles of variance
due to strategy (differentiation of
deviant criterion groups), method
(the uniqueness of the MMPI item
pool), and style
(acquiescence-cautiousness). It was further assumed
that scales which were found to be
equally influenced by these two
measures could be considered as having
potential for the measurement of
styles that are not peculiar to this
instrument. By this reasoning, the Pd
and F scales were interpreted as
measures of the strategic dimension of
communality, and the stylistic
dimension of acquiescence-cautiousness
was assumed to be represented by the
Rb (response bias) scale at one end
and the R (repression) scale at the
other.

Extending this admittedly specula- 
tive reasoning even further, we can 
venture an interpretation of the pos- 
sible psychological significance of 
Deviant True and Deviant False 
when considered in relation to the 
hypothetical dimensions of com- 
munality and acquiescence. The 
positive end of the Deviant True di- 
mension which is represented by the 
Nuc would be considered to represent 
a fusion of acquiescence and non- 
communality. The opposite end of 
Deviant True, represented by the K 
scale would be thought of as a fusion 
of hypercommunality tendencies with 
cautiousness. Although the Deviant 
False dimension is not purely repre- 
sented by any scale, the Noc, D, Hy, 
and Pa scales could be thought of as 
fusions of cautiousness and noncom- 
munality. 

The above speculations, although 
plausible in the light of the known 
correlates of some of the scales dis- 
cussed, have the status of intuitive 
hunches that must be subjected to 
more rigorous test. It is suggested 
that factor analysis is the method of 
choice and particularly of the type 
advocated by Jackson and Messick 
(1961). It is hoped that the
distinctions made in this paper will guide
the selection of appropriate "marker
variables" in these analyses such that
dimensions will be isolated which
have generality beyond the MMPI
as an instrument. The stylistic
dimension of acquiescence and its
complement of cautiousness would seem
to have some precedence as a
theoretical construct. Likewise, the
concept of communality, which stems
from Berg's generally useful Deviation
Hypothesis, would seem to be
worthy of further exploration. If the
MMPI method imposes limitations
on the development of stylistic
measures that are independent of Deviant
True and Deviant False, it would
seem that efforts should be directed
toward the development of balanced
or equally influenced measures of
other dimensions, and particularly
that represented by the concept of
hypercommunality.


SUMMARY 


A distinction was drawn among
three sources of variance that
contribute to assessment with the MMPI.
Strategic variance arises in the
assessment of a subject's communality with
respect to a normative group on a 
dimension defined by contrast with a 
criterion group. Method variance is 
due to the idiosyncratic nature of the 
total item pool in regard to the pro- 
portion of true and false keyings and 
the distribution of item popularity 
values. Stylistic variance includes 
dispositions to agree (acquiescence) 
or disagree (cautiousness) with 
neutral statements, independently of 
item content. With these distinctions 
as organizing concepts, research relat- 
ing to social desirability, acquiescence, 
and deviant response biases in the 
MMPI was critically reviewed. Sug- 
gestions were made concerning the 
measurement of these variables with- 
in the MMPI so that factor analytic 
studies might be guided by marker 
variables of broadened theoretical 
significance.


COMPARATIVE PSYCHOLOGICAL STUDIES OF NEGROES
AND WHITES IN THE UNITED STATES: 
A CLARIFICATION 


BENJAMIN PASAMANICK 
Columbus Psychiatric Institute and Hospital, Ohio 


Dreger and Miller (1960) in their 
recent review of Negro-white dif- 
ferences dismiss our early work on 
Negro child development (Pasamanick,
1946) with the blanket indictment
that "Inadequacies of sampling,
as well as the more general difficulty
of estimating skin color subjectively,
tend to invalidate Pasamanick's
conclusions."

To clarify for ourselves and the 
readers of the review the basis for the 
general rejection of our study we 
entered into correspondence with the 
senior author (Dreger) seeking an 
elaboration and specification for his 
criticism. The result of this corre- 
spondence was the expansion of the 
condemnatory statement into six 
specific objections. Two concerned 
sampling methods, one dealt with the 
infant examination procedures, an- 
other with the possibility of inherent 
bias in the examiner, one with the 
method of dichotomizing the Negro 
infants by skin color as light or dark, 
and finally one which questioned the
reliability and validity of the Gesell
developmental methods used to obtain
the DQ for our subjects.

The first and most important objection
involves the contention by Dreger that
the Negro infants in the study were born
to parents who were too well educated to
be representative of the New Haven Negro
population. Dreger calls attention to
the fact that in 1940 the median years
of schooling for the entire United
States population 25 years of age and
older was 8.6 years and that by 1950 it
had only risen to 9.3 years. For the
United States Negro population the 
median years of schooling was even 
lower. For example, in 1940 the 
median years of schooling for Negroes, 
age 25 or over, was only 5.7; even in 
Connecticut the Negro median was 
only 7.6. (Census data taken from 
the United States Bureau of the 
Census, 1952a, 1952b). Hence, since 
the parents of these infants had at- 
tained over 10 years of schooling on 
the average Dreger believed they 
could hardly represent the Negro pop- 
ulation in Connecticut or elsewhere. 

This objection can be answered
easily. The error lies in lumping
together the educational attainments
of all Negroes 25 years of age and
over, thereby including the older and
far less well educated individuals. If 
one restricts the analysis to New 
Haven and to the age category of the 
parents of the Negro infants, i.e., the 
child-bearing years of 20-40, it be- 
comes readily apparent that they do 
indeed typify the New Haven Negro 
population within this age range. 
Further, it is our contention that the 
educational attainment of parents is 
largely irrelevant as far as the DQ 
of infants is concerned. It was indi- 
cated in the study criticized by 
Dreger that children whose parents 
had an educational level little more 
than grammar school did not differ 
in mean scores from the children of 
parents who had approximately a 
high school education. We confirmed 
these findings in the 1,000 subjects 
involved in our Baltimore study of
child development and again found
that education of parents as well as 
socioeconomic status apparently does 
not play an important role in either 
whites or Negroes until the children 
are 3 years of age or older (Knobloch 
& Pasamanick, 1960). 

The second Dreger objection is of 
the same order. He suggested that 
the white infants in our study were 
also unrepresentative of white infants 
generally. Specifically he cited our 
inclusion of 61 white illegitimate in- 
fants (an adoption group), nursery 
applicants whose parents averaged 
over 17 years of schooling, and a 
third group comprising residents of an 
orphan asylum. 

In response to this objection it 
should be noted that the study popu- 
lation of white infants was not 
selected as being representative of 
New Haven white infants. The major 
comparison in our paper involved a 
contrast of the aforementioned repre- 
sentative Negro infants with the norma- 
tive group used by Gesell for establish- 
ing his developmental norms. 

The white infants in our investiga- 
tion were selected for reasons quite 
apart from this comparison which 
were made explicit in the paper. For 
example, because of a Jamaica study 
on Negro children from a créche for 
whom quite low scores were ob- 
tained we included the institutional- 
ized group of white infants. Our 
adoption group was included in order 
to compare the illegitimate children 
in and out of an institution. Infants 
of white upper and middle class 
parents were studied because of our 
feeling that, regardless of the class
standing of parents, white and Negro
infants do not differ appreciably in
behavioral development during early
life. These three groups, the upper
class white, the adoption, and the
illegitimate, were specifically excluded
as controls. They represented
substudy populations and served only
for comparative purposes. They
were, in short, three specially selected
criterion groups.

Dreger's third criticism is based 
largely on a misconception of our 
infant examination procedure. The 
objection, simply stated, is that Negro 
mothers most often were present and 
white mothers most often absent dur- 
ing the examination of the infants. 
Since the presence of the mother may 
be presumed to influence positively 
the response of the infants he con- 
sidered that the Negro infants were 
unduly favored. 

The actual procedure which was 
stipulated and enforced, and is indeed 
standard Gesell method, was that the 
mothers or mother substitutes, e.g., 
the nurse who cared for the institu- 
tionalized infants, be present during 
the course of the examination. Such 
was the case for all infants, white 
and nonwhite. 

The fourth contention of Dreger is 
that examiner bias could have been 
a potent factor in uprating the scores 
of the Negro infants. The presump- 
tion is that such bias would of neces- 
sity have favored the nonwhite in- 
fants. The further presumption is 
that such bias could not be ade- 
quately controlled. We, of course, 
are compelled to agree that such bias 
is a possibility and further that there
is no known method of assessing or
controlling its influence. On the other
hand there is no reason to suspect 
that its presence in either direction 
was greater in this than in other in- 
vestigations of the same type. That 
it might not even be as great was 
indicated in an implicit test of bias 
in the Baltimore replication. In that 
study the examiner, who did not 
know the status of the subjects, 
found lower quotients and more
damage in the premature than in the
control infants, both Negro and
white (Knobloch & Pasamanick,
1958).


In his fifth specific criticism Dreger 
pointed out that dichotomizing the 
skin color of Negro infants into light 
and dark is at best subjective and 
probably unreliable as well. He 
argued that such color distinctions 
are not our problem alone but con- 
cluded with the admonition that since 
we dealt with subjects ‘‘as close to 
being only hereditarily influenced as 
it is possible to get" the problem is 
presumably more critical in our 
study than might be the case in other 
studies later in life. 

Again we agree wholly with the 
argument that skin color determina- 
tions are both subjective and often 
unreliable. For precisely these reasons 
great pains were taken to examine 
and standardize our procedures for 
skin color determination. At first
a color wheel was utilized. Later two
judges were found to agree with one
another in the determination of light
and dark skin with almost perfect
reliability. Despite this reliability
we are perfectly willing to grant
that skin color is not very good
evidence of the degree of mixture of
white and nonwhite strains. Since
[...] found significant differences
[...] such a comparison we [...]
warranted and necessary to [...]
their work and utilized the [...]
techniques currently available in
[...]. Incidentally, we found no
significant differences.

The sixth criticism of our paper by
Dreger [...] upon his dissatisfaction
[...] Gesell schedules, presumably
on the score of reliability and
validity. With respect to this point
which would require an undue amount of
[...] to discuss we would prefer to
refer the reader to papers which
comment on the adequacy of the Gesell
schedules and indicate that in trained 
hands correlations of .5 to .75 can be 
secured with the infant scores 2-7 
years later. In addition, interjudge 
reliability with correlations centering 
about .9 are obtained (Knobloch & 
Pasamanick, 1960, 1961). 

In reply to the counter discussion 
given above Dreger wrote the follow- 
ing: (It is given verbatim since we 
are unable fully to comprehend the 
criticism. Nevertheless our attempt 
at clarification, within the limits of 
our comprehension of the points 
made, follows his comments.) 


Your point is well taken that your Negro 
infants were representative of New Haven in- 
fants of their race. I cannot see, however, 
that they are representative of Negro infants 
throughout the country; if they were not 
intended to be so, considerable misunder- 
standing has existed in my own mind—and 
others’, too, I believe. It also seems that 
parental educational level does have some- 
thing to do with representativeness of infancy. 
One of the contentions of hereditarians is that, 
of course, there are wide individual differences 
among Negroes; if then comparisons of whites 
from different levels are made with Negroes 
from the upper levels, an unfair conclusion is 
drawn. As you can see, a modification of the 
selective migration hypothesis is involved here. 
However, if the major comparison you made 
was between what amounts to upper level 
educational Negroes (parent level) and 
Gesell's normative group, I should judge no 
harm would result. For even though you point 
out that his norms are more representative
they seem to be almost certainly from upper
classes. At least, [...] fact that they were
not controlled for in the earlier work might
make it necessary to conclude from that study
alone that the lack of
differences between the two groups could be 
the result of socio-economic factors. 


We are somewhat at a loss to
understand the criticism that while
our Negro infants were representative 
of New Haven infants of their race 
they are not representative of Negro 
infants throughout the country. 
Since, as far as we are aware, no 
representative group of Negro chil- 
dren in the United States has ever 
been compared with any representa- 
tive group of white children for the 
country as a whole, the Psychological 
Bulletin review as a whole would ap- 
pear to lack relevance. It would 
merely have sufficed to state that all 
the groups studied were unrepresenta- 
tive and that only local conclusions 
could be drawn but no generalized 
statements were warranted. 

Of course, the New Haven Negroes 
studied could not be claimed to be 
representative of all Negroes in the 
United States, although I dare say 
they are probably not too
unrepresentative. The important point was
the similarity to whites during infancy
of a group who later in life would
almost certainly display intelligence
test scores significantly lower than
the scores of comparably aged
whites. The similarity existed despite
the fact that the children came from 
homes that were lower socioeconom- 
ically and educationally. 

Precisely the same findings were 
described in our Baltimore investiga- 
tion where hundreds of Negro and 
white children were involved. In 
these studies special efforts were made 
to see to it that both racial groups 
were representative of their
respective populations. By almost all
demographic variables Baltimore Negroes
are probably as representative of 
American Negroes as can be found in 
any disadvantaged group. Selective 
migration would not be expected to 
play any important role in this city 
south of the Mason-Dixon line. In 
our study the N was large enough so
that internal comparisons could be
tested adequately. As stated
previously, neither race, parental
education, nor socioeconomic status during
infancy was associated with any
differences in infant performance.

A point is made that the Gesell 
norms were derived from upper 
classes and that this is the general 
criticism heard by Dreger across the 
years. Hearsay is, of course, not the 
best evidence. Reference to Gesell's 
(Gesell & Thompson, 1938) descrip- 
tion of the first normative sample in- 
dicates that it was drawn from a 
fairly homogeneous lower middle 
class group. The norms were changed 
slightly in "Developmental
Diagnosis" (Gesell & Amatruda, 1941)
on the basis of a heterogeneous group 
never described in the literature but 
not very different from the original 
sample. However, I must point out 
that had the norms been derived from 
an upper socioeconomic group (one 
such group is included in the paper 
under criticism) one would presum- 
ably expect that these norms would 
be higher than those derived from a 
lower class group. Accordingly then, 
following Dreger's logic even greater 
differences might have been expected 
in comparisons with the lowest Amer- 
ican socioeconomic group, i. e., Ne- 
groes. However, the fact that no 
differences at all were found should 
suffice to indicate that Dreger's 
statement "that the lack of differences
between the groups could be the
result of socio-economic factors" is
quite irrelevant.

We are grateful for the opportunity 
afforded us to reply to the crucial 
questions raised by the Dreger and 
Miller critique of our work and to 
Dreger for his generous permission 
to use our correspondence. It might 
appear that the authors were a bit 
premature in disposing of our work.


During the past 15 years, increasing
attention has been given in psychology
to the intensive study of the
single individual with repeated meas- 
ures gathered over a long period of 
time. For the most part, intra- 
individual differences due to repeated 
testing of the same person have been 
viewed as error variance to be ignored 
or canceled out by some kind of aver- 
aging process. Most of psychology 
has been primarily concerned with 
the development of nomothetic laws 
governing behavior in general in 
which the uniqueness of the indi- 
vidual is completely lost. It is only 
recently with the advent of new
conceptual tools and high speed
computers that the statistical study of
the single case has been seriously 
attempted. 

Searching for a systematic ap- 
proach to the study of individual dif- 
ferences, Cattell (1946) developed a 
scheme that he has called the covaria- 
tion chart, consisting of major refer- 
ence axes upon which a taxonomy of 
experimental designs can be con- 
structed. Originally consisting of 
three primary axes, the covariation 
chart could be outlined as a box-like 
structure with scores across different 
subjects, different tests, and different 
occasions representing the three di- 
mensions. P technique was coined 
by Cattell to denote the special case 
where the intercorrelations among K 
scores across T occasions are factor
analyzed for a single individual. 


1 Paper presented at Symposium on the
Development of Multivariate Experiment in
Psychology, University of Illinois,
November 14-16, 1960.

THE MODEL OF MULTIPLE
TIME SERIES


Stripped of its psychological con- 
notations, the basic model for P 
technique has been known by the 
statistician as multiple time series. 
A series of observations taken in a 
definite time order usually consists of 
measures which are not independent 
of one another and which conse- 
quently do not satisfy the assumption 
of independence made in most statis- 
tical analyses. The interval between 
successive observations is arbitrary, 
usually a day, a month, or a year, 
depending upon the preference of the 
investigator or the convention of 
society. Everyone is familiar with 
such time series as the cost-of-living 
index, annual population estimates, 
and monthly rainfall. Because time 
is the primary dimension for the 
ordering of variables and the defini- 
tion of the unit of measurement, the 
variety and number of plausible vari- 
ables are almost infinite. Of course in 
any one study, it is only possible to 
include a small number of time series, 
hopefully those observations most 
likely to cast some light on the dy- 
namic nature of the process. 

Traditionally, the greatest amount 
of attention has been given to the 
analysis of the internal structure of 
one or more time series in order to
understand the factors which in- 
fluence a particular variable such as 
the size of population. The forecast- 
ing of future trends from naturalistic 
observations has grown into a com- 
plex empirical science involving the 
search for statistical regularities in 
time series. More recently, a somewhat
different emphasis has been developed
by a growing number of
statisticians, namely, the multivari- 
ate analysis of correlations between 
time series. It is this latter focus 
which is of central interest to the 
psychologist employing P technique. 
In its simplest form, P technique 
involves no more methodological dif- 
ficulties than any other factor analy- 
sis. The score distributions are as- 
sumed to be reasonably normal with 
complete independence of observa- 
tion from one point in time to the 
next. The fact that variance is meas- 
ured across time rather than across 
individuals makes no difference if in- 
deed the successive observations in a 
given time series are independent. 
One has only to repeat the observa- 
tions over a sufficiently large number 
of occasions to gain stability in the 
intercorrelations and resulting factor 
analysis. It would make the life of a 
psychologist or statistician much 
easier if this simple model for P tech- 
nique actually worked. Unfortu- 
nately, all too often the observations 
made on a single individual through 
time are highly correlated rather 
than independent. Let us take a 
closer look at the nature of such 
serial dependence and what can be
done to allow for it in the analysis. 


THE PROBLEM OF SERIAL 
CORRELATION 


Perhaps the most serious problem
from the psychologist's point of view
is the effect of repeated measurement
upon that which is being measured.
One is restricted in P technique to
those variables which can be measured
again and again on the same
individual without serious loss of validity.
This very point means that there
must be wholesale elimination of
many important psychological variables
simply because there is no way
to measure them repeatedly. It may
be possible to develop abbreviated 
alternate forms of some tests in suf- 
ficient quantity to make feasible 
their use for repeated measurement 
of the same individual. Moran (1959) 
has done an admirable job of develop- 
ing a special battery of repetitive psy- 
chological measures for the study of 
mental ability. But in general, the 
range of mental tests which can be 
adapted to repetitive use is quite 
limited. 

Given that a variable can be meas- 
ured repeatedly, there still remains 
the problem of sequential effects 
through time. For any variable which 
involves strong practice, fatigue, or 
adaptation effects through time, the 
degree of serial correlation will be 
high and must be taken into account 
in any analysis of the variable. The 
amount of such correlation between 
serial observations may vary con- 
siderably from one phase in the time 
series to another, as well as from one 
variable to the next, complicating the 
analysis still further. 

The most direct method for deter- 
mining the amount of sequential de- 
pendency of observations in a time 
series is to compute the serial corre- 
lation coefficients for the series. For 
the first serial correlation coefficient, 
usually symbolized as r_1, the first
observation is paired with the second, 
the second with the third, and so on 
until the last observation in the series 
has been reached. Similarly, the 
second, third, and nth coefficients 
can be computed by appropriate lag 
between the two observations consti- 
tuting the pairs being correlated. 
The entire array of serial correlation 
coefficients makes up a correlogram
when arranged in graphic form with 
the magnitude of the correlation ex- 
pressed as a function of the lag. The 
significance of any serial correlation
may be tested approximately by using
any table for ordinary correlation
coefficients. 
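
The pairing rule described above can be sketched in a few lines of Python. This is only an illustration, assuming NumPy is available; the function names `serial_correlation` and `correlogram` and the two toy series are hypothetical, and the coefficient is computed as an ordinary product-moment correlation between the lagged pairs, which is one of several estimators in use.

```python
import numpy as np

def serial_correlation(series, lag):
    """Correlate each observation with the one `lag` steps later."""
    x = np.asarray(series, dtype=float)
    if lag < 1 or lag >= len(x) - 1:
        raise ValueError("lag must be between 1 and len(series) - 2")
    # Pair observation t with observation t + lag, as described in the text.
    return float(np.corrcoef(x[:-lag], x[lag:])[0, 1])

def correlogram(series, max_lag):
    """Serial correlation coefficients for lags 1 .. max_lag."""
    return [serial_correlation(series, k) for k in range(1, max_lag + 1)]

# Hypothetical data: a steadily drifting series and a strictly alternating one.
trend = np.arange(50, dtype=float)
alternating = np.array([1.0, -1.0] * 25)

print(correlogram(trend, 3))        # every coefficient is close to +1
print(correlogram(alternating, 2))  # close to -1 at lag 1, +1 at lag 2
```

Plotting the returned list against the lag gives the correlogram in graphic form.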

Correlation coefficients may be 
computed between time series in the 
usual manner, regardless of the de- 
gree of serial correlation present in 
the two series. Interpretation of such 
cross-correlations, however, is an- 
other matter. In the first place, the 
stability of the cross-correlation is 
directly affected by the degree of 
serial correlation present in the two 
time series. The degrees of freedom 
for testing the significance of the 
cross-correlation are reduced due to 
the sequential dependency of obser- 
vations. 

For example, suppose two vari- 
ables, x and y, are measured daily for 
100 days in an experiment using P 
technique. The cross-correlation be- 
tween x and y is .30; the first four 
serial coefficients for x are .70, .50, 
.40, and .20; and the same coefficients
for y are .50, .40, .30, and .10. Follow- 
ing Quenouille's suggestion (1952, 
p. 170), Bartlett's formula may be 
used for estimating the number of 
observations equivalent to one inde- 
pendent observation. For the present 
example, the number of paired obser- 
vations needed to equal one degree of 
freedom for purposes of statistical in- 
ference is at least 2.38. If serial cor- 
relation beyond the fourth coefficient 
had been included in the equation, 


$$1 + 2r_{1x}r_{1y} + 2r_{2x}r_{2y} + 2r_{3x}r_{3y} + \cdots = 2.38 + \cdots$$


the estimate would have been still 
higher. In other words, there are 
fewer than 42 degrees of freedom, in- 
stead of 98, for testing the signifi- 
cance of the obtained correlation, too 
few for a correlation of .30 to prove 
significant beyond the .05 level. If 
the serial correlations had not 
dropped to zero after the fourth lag,
the reduction in degrees of freedom
would have been even more severe. 
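
The arithmetic of this example can be checked with a short Python sketch. The function name `observations_per_df` is hypothetical; the formula is the truncated sum of products of matching serial coefficients used in the example, carried only to the fourth lag.

```python
def observations_per_df(serial_x, serial_y):
    """Number of paired observations equivalent to one independent
    observation: 1 + 2 * sum of products of matching serial coefficients."""
    return 1.0 + 2.0 * sum(rx * ry for rx, ry in zip(serial_x, serial_y))

# Serial coefficients for x and y taken from the example in the text.
rx = [0.70, 0.50, 0.40, 0.20]
ry = [0.50, 0.40, 0.30, 0.10]

factor = observations_per_df(rx, ry)
print(round(factor, 2))        # 2.38
print(round(100 / factor, 1))  # 42.0 equivalent independent observations
```

Dividing the 100 daily observations by 2.38 gives roughly 42, which is the upper bound on the degrees of freedom quoted above.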

A second major difficulty obscur- 
ing the meaning of cross-correlations 
when there is serial correlation in the 
two time series being correlated, 
arises from the fact that significant 
lag correlation may also exist between 
Variable x at one point in time and 
Variable y at a later or earlier point 
in time. In the above example, the 
cross-correlation with zero lag is .30. 
But what about the cross-correlation 
when the observations on x at time t
are paired with the observations on
y at time t-1, t-2, or t-s?
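
A lagged cross-correlation of this kind can be computed by shifting one series against the other before correlating. The following Python sketch assumes NumPy; the function name `lagged_cross_correlation` and the simulated series are hypothetical and serve only to show the pairing.

```python
import numpy as np

def lagged_cross_correlation(x, y, lag):
    """Correlate x at time t with y at time t - lag.

    A positive lag pairs x with earlier values of y; a negative lag
    pairs x with later values of y; lag 0 is the usual cross-correlation.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if len(x) != len(y) or abs(lag) >= len(x) - 1:
        raise ValueError("series must match in length and exceed the lag")
    if lag > 0:
        a, b = x[lag:], y[:-lag]
    elif lag < 0:
        a, b = x[:lag], y[-lag:]
    else:
        a, b = x, y
    return float(np.corrcoef(a, b)[0, 1])

# Hypothetical data in which x simply repeats y two occasions later.
rng = np.random.default_rng(0)
y = rng.normal(size=100)
x = np.zeros(100)
x[2:] = y[:-2]

print(lagged_cross_correlation(x, y, 0))  # zero-lag correlation
print(lagged_cross_correlation(x, y, 2))  # correlation with y leading by 2
```

In this constructed case the relationship is invisible at zero lag and essentially perfect at a lag of two occasions.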

Quenouille (1952, p. 170) has sug- 
gested the use of partial correlation 
coefficients to gain insight into the 
relationship between x and y. A 
given time series can be thought of as 
containing two parts: a systematic 
part which is predictable at any point 
in time from the preceding observa- 
tions, and a random or residual part 
after removal of the serial correla- 
tion. The cross-correlation between 
simultaneous random parts in the two 
series can be obtained in a straight- 
forward manner by partialling out 
the systematic parts of both vari- 
ables. If one is interested in deter- 
mining whether the nonsimultaneous 
random parts of two series are corre- 
lated, lagged partial correlation co- 
efficients have to be calculated. 
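
The idea of separating a systematic part from a residual part can be illustrated with a deliberately simplified Python sketch. It assumes NumPy and assumes that a first-order scheme (each observation predicted linearly from the one before it) is adequate for both series; this is a stand-in for illustration and not Quenouille's own procedure. The names `ar1_residuals` and `residual_cross_correlation` and the simulated data are hypothetical.

```python
import numpy as np

def ar1_residuals(series):
    """Residual part of a series after predicting each observation
    from the preceding one by least squares (first-order assumption)."""
    x = np.asarray(series, dtype=float)
    prev, curr = x[:-1], x[1:]
    slope, intercept = np.polyfit(prev, curr, 1)
    return curr - (intercept + slope * prev)

def residual_cross_correlation(x, y):
    """Correlation between the simultaneous residual parts of x and y."""
    return float(np.corrcoef(ar1_residuals(x), ar1_residuals(y))[0, 1])

# Hypothetical data: two serially correlated series built from
# independent random shocks.
rng = np.random.default_rng(0)
n = 200
shocks_x, shocks_y = rng.normal(size=n), rng.normal(size=n)
x, y = np.zeros(n), np.zeros(n)
for t in range(1, n):
    x[t] = 0.7 * x[t - 1] + shocks_x[t]
    y[t] = 0.5 * y[t - 1] + shocks_y[t]

print(residual_cross_correlation(x, y))
```

A lagged version would pair the residuals of one series with shifted residuals of the other, at the cost of one further observation per unit of lag.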

More recently, Quenouille (1957) 
has expanded upon this method for 
the analysis of multiple time series. 
His technique first involves the com- 
putation of complete intercorrelation 
matrices among the K time series 
including those obtained when each 
series is lagged in time behind the 
others. The second step is to deter- 
mine whether a Markoff scheme is 
operating, and if so, what order 
scheme is necessary to explain the
systematic components in the K time
series. The next step is to examine
the latent roots of the quotients 
RR, where R_s is the intercorrelation
matrix of cross-correlations
lagged s intervals in time, in order to 
determine the specific nature of any 
trend components and relationships 
independent of trend. In addition to 
requiring excessively laborious com- 
putations, Quenouille’s model for the 
analysis of multiple time series is still 
in a strictly experimental stage of 
development. Nevertheless, it serves 
to dramatize the difficulties one may 
encounter in dealing with multiple 
time series of the type often included 
in P technique experiments. 


THE USE OF FACTOR ANALYSIS
IN P TECHNIQUE 


The extent to which the presence of 
serial correlation distorts the factor 
analysis of zero-lagged cross-correla- 
tions in multiple time series—the 
usual intercorrelation matrix for P 
technique—is not known. Most investigators
have been content to
ignore the problem of serial correla- 
tion. Others have dealt with it in only 

2 A slightly different approach to multiple time series has been taken by G. P.
Williams and E. K. Harris who have examined the effect of serial correlation on
principal components. "The system of auto- and cross-correlations among u
variables may be expressed in the form of a symmetric matrix, consisting of
circulant submatrices, each of which contains the lagged correlations between a
given pair of variables. The eigenvalues, eigenvectors, and inverses of such
submatrices are known. An analytical solution for the eigenvalues of this matrix
has not yet been obtained. However, to illustrate the [...], four meteorological
variables (temperature, relative humidity, barometric pressure, and wind speed)
have been used. [...] consists of daily readings over a [...]." (From abstract of
paper given at the Annual Meeting of the American Statistical Association,
Chicago, Illinois, December 30, 1958.)

a very crude manner by adding synthetic
time variables to the correlation
matrix and eliminating trends by
the rotation of factors on which the 
time variables are loaded. For ex- 
ample, Mefferd, Moran, and Kimble 
(1958) included three synthetic time 
variables in an analysis of 44 time 
series, each containing 144 observa- 
tions. The first was intended to ac- 
count for any linear trend and con- 
sisted merely of the day number rang- 
ing from 1 to 144. The second was 
designed to pick up any rapid cycling 
effect over a 48-hour period and con- 
sisted of the digits 1, 2, 1, 2... be- 
ginning with the first set of observa- 
tions and running through to the last. 
The third synthetic variable was also 
an oscillating one, but with a slower 
period consisting of the digits 1, 1, 2, 
2...running through the series. 
One last suggestion that has not yet 
been tried out successfully was put 
forth by Demarin (1958) who pro- 
posed the use of Quenouille's partial 
correlation method to correct cross- 
correlations for serial correlation 
prior to carrying out the factor analy- 
sis. 
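
The three synthetic time variables described above are easy to construct. The following Python sketch assumes NumPy and 144 daily occasions as in the study cited; the variable names are hypothetical.

```python
import numpy as np

n_days = 144  # number of observations per series in the study cited

# Linear trend: the day number, 1 through 144.
day_number = np.arange(1, n_days + 1)

# Rapid cycle: 1, 2, 1, 2, ... repeated through the series.
fast_cycle = np.tile([1, 2], n_days // 2)

# Slower cycle: 1, 1, 2, 2, ... repeated through the series.
slow_cycle = np.tile([1, 1, 2, 2], n_days // 4)

# The three columns would be appended to the observed time series
# before the intercorrelations are computed.
synthetic = np.column_stack([day_number, fast_cycle, slow_cycle])
print(synthetic[:6])
```

Because these columns contain no error of measurement, they behave differently from the observed variables in the factor analysis, which is the point of the criticism taken up next.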

In a critical review of P technique, 
Anderson (1958) questioned seriously 
the advisability of any of these 
methods for dealing with systematic 
effects, particularly the introduction 
of errorless, synthetic time variables 
which are then treated by factor 
analysis in the same manner as the
other measures. Although Anderson 
has no general method to offer in lieu 
of factor analysis for dealing with 
large systems of P technique data, he 
does recommend that such trend 
components be eliminated directly by 
regression techniques prior to the fac- 
tor analysis. But as Cattell (1961) 
has pointed out, in the usual study 
where there are many variables, fac- 
tors that are highly correlated with 
time will separate from one another
without serious distortion if rotation
to simple structure is achieved. Un- 
fortunately, the necessary theoretical 
analysis and empirical determination 
of the conditions under which trend 
components due to serial correlation 
distort the factor analysis or render 
it invalid have yet to be done. 
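
The regression approach recommended by Anderson can be illustrated for the simplest case of a linear trend. The Python sketch below assumes NumPy; the function name `remove_linear_trend` and the simulated series are hypothetical, and only a straight-line trend is removed, whereas real data may call for more elaborate trend terms.

```python
import numpy as np

def remove_linear_trend(series):
    """Residuals of a series after least-squares regression on time."""
    x = np.asarray(series, dtype=float)
    t = np.arange(len(x))
    slope, intercept = np.polyfit(t, x, 1)
    return x - (intercept + slope * t)

# Hypothetical data: two otherwise unrelated series sharing an upward drift.
rng = np.random.default_rng(1)
t = np.arange(144)
x = 0.1 * t + rng.normal(size=144)
y = 0.1 * t + rng.normal(size=144)

raw_r = float(np.corrcoef(x, y)[0, 1])
detrended_r = float(
    np.corrcoef(remove_linear_trend(x), remove_linear_trend(y))[0, 1]
)
print(raw_r, detrended_r)
```

The raw correlation here is produced almost entirely by the common drift; once the trend is regressed out of each series, little of it remains.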

A second major weakness in most 
factor analytic studies employing P 
technique is failure to examine any- 
thing but the zero lag cross-correla- 
tions between the time series. Most 
investigations involving repeated 
measurement on a single individual 
include several systems or levels of 
behavior in the same analysis. These 
systems may range all the way from 
biochemical assay of metabolic prod- 
ucts in urine to behavior ratings in 
social situations. In the comprehen- 
sive study by Mefferd, Moran, and 
Kimble (1958), for example, six dif- 
ferent systems of time series—chem- 
ical, physiological, psychological, 
psychiatric, environmental, and syn- 
thetic—were combined in one analy- 
sis. It is highly reasonable to expect 
some time lag in the relationships of 
variables in one system relative 
to those in another. Biochemical 
changes may precede or follow 
changes in the physiological, mental, 
or behavioral domains. Indeed, as 
Cattell, Mefferd, and others have
pointed out, it is more plausible to 
hypothesize some time lag between 
certain variables than it is to deal 
only with the correlations across 
simultaneous observations. 

Cattell (1958) has suggested that 
at least the first two or three lag- 
correlations be computed systemat- 
ically for all cross-correlations. The 
largest correlation coefficient (what 
Cattell calls the "maximized lead-and-lag
correlation") should then be inserted
in the correlation matrix as
the best estimate of r_xy for subsequent
factor analysis. Anderson
(1958) objects to this procedure on
two grounds: only the strongest rela- 
tionship goes into the factor analy- 
sis, completely glossing over the 
other relationships; and the original 
factor analytic model is no longer 
applicable because the degree of lag 
varies from one cross-correlation to 
the next. 
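
The selection rule under discussion can be made concrete with a small Python sketch. It assumes NumPy; the function name `maximized_lead_lag` and the simulated data are hypothetical, and the rule coded here (keep whichever lead or lag gives the largest absolute correlation) is a plain reading of the procedure rather than Cattell's own computational routine.

```python
import numpy as np

def maximized_lead_lag(x, y, max_lag):
    """Return (lag, r) for the lead or lag in -max_lag..max_lag whose
    correlation is largest in absolute value.

    A positive lag pairs x at time t with y at time t - lag.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    best_lag, best_r = 0, float(np.corrcoef(x, y)[0, 1])
    for lag in range(-max_lag, max_lag + 1):
        if lag == 0:
            continue
        if lag > 0:
            r = np.corrcoef(x[lag:], y[:n - lag])[0, 1]
        else:
            r = np.corrcoef(x[:n + lag], y[-lag:])[0, 1]
        if abs(r) > abs(best_r):
            best_lag, best_r = lag, float(r)
    return best_lag, best_r

# Hypothetical data in which x repeats y one occasion later.
rng = np.random.default_rng(0)
y = rng.normal(size=120)
x = np.zeros(120)
x[1:] = y[:-1]

print(maximized_lead_lag(x, y, 3))
```

Note that the function returns a different lag for each pair of variables, so a matrix filled in this way mixes relationships at unequal lags, which is the second of Anderson's objections.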

A third criticism of Cattell's proce- 
dure can be made which recognizes a 
difficulty which is rather general in 
P technique. Rarely is it possible to 
obtain a very large number of ob- 
servations through time in psycho- 
logical studies of a single person. The 
presence of serial correlation reduces
still further the stability of obtained 
correlations. When one considers the 
large number of cross-correlations 
possible, the likelihood of severe distortion
due to exploitation of chance
is very high. In the data obtained by
Mefferd and his colleagues, for ex- 
ample, there are 6,622 cross-correla- 
tions if one allows a lag varying only 
from 0 to 3. The number of addi- 
tional cross-correlations increases at 
the rate of 1,892 per lag. Since the 
degrees of freedom for evaluating the 
stability of any one of these coeffi- 
cients will be somewhat less than 142, 
depending upon the degree of serial 
correlation present in the two time 
series, the dangers of Cattell's post 
hoc method of selecting the maximum 
cross-correlations should be apparent. 
Of course, where one has a good a
priori reason from previous experimental
studies to assume that a
given lag between x and y is appro- 
priate, one has a basis for selecting 
certain lagged correlations for special 
analysis.

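
The counts quoted above follow from simple combinatorics, assuming the 44 time series of the Mefferd study: each unordered pair contributes one zero-lag correlation, and each ordered pair contributes one correlation per unit of lag. The function name in this Python sketch is hypothetical.

```python
def cross_correlation_count(n_series, max_lag):
    """Number of distinct cross-correlations among n_series variables
    when the lag is allowed to run from 0 to max_lag."""
    zero_lag = n_series * (n_series - 1) // 2   # unordered pairs
    per_lag = n_series * (n_series - 1)         # ordered pairs (x leads or y leads)
    return zero_lag + per_lag * max_lag

print(44 * 43)                         # 1892 additional correlations per lag
print(cross_correlation_count(44, 3))  # 6622
```

With several thousand coefficients to choose among and fewer than 142 degrees of freedom behind each, some large values are bound to arise by chance alone.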
The presence of clearly distinctive 
systems of variables or levels of human
functioning in the same set of
repeated measures over time suggests 
that there may be better methods of 
multivariate analysis than straightforward
factor analysis in which all K
variables are treated in an identical 
manner. The correlations between 
systems of variables are far more in- 
teresting than the correlations within 
systems. In those cases where such 
distinctive systems exist a priori, as 
in the time series collected by 
Mefferd, Moran, and Kimble (1958), 
Tucker's (1958) interbattery method 
of factor analysis appears to be more 
appropriate than traditional proce- 
dures. Of course, if one really makes 
a thorough analysis of multiple time 
series along the lines suggested by 
Quenouille (1957), only a little extra 
work is required to calculate the 
canonical correlations and variates 
that emerge from the relationships 
between the systems. In this way it 
would be possible to maximize most 
efficiently the nonvanishing correla- 
tions between canonical variates 
within such systems as the biochemical
measures from urinary metabolites,
the psychophysiological response
variables, and the scores on tests of 
mental ability. 


SPECIAL PROBLEMS ENCOUNTERED 
IN PSYCHOPHYSIOLOGICAL
VARIABLES 


Cattell has made a useful distinction
between trait and state with respect
to the appropriate multivariate
design to employ in a given experiment.
The distinction can best be illustrated
by reference to the concept of anxiety.
As Cattell states (1961):


[...] speech recognizes that one can have
[...] person who all his life is [...] operating at a higher
anxiety level—and a typically non-anxious
person who is temporarily in a highly anxious
state. Is there a continuum in nature and form
of expression between permanent (trait)
and temporary (state) anxiety? That is to
say, is characterological or trait anxiety just
temporary anxiety held permanently
or are trait and state different things with
different patterns (Ch. 9, p. 149)?³

Recognizing the possibility that a
given measurement applied once to a 
sample of individuals may not in- 
volve the same functional unities as 
the identical measurement applied to 
one individual over a number of occa- 
sions, Cattell has recommended that 
systematic programs of research be 
undertaken in which a number of 
marker variables from R technique 
experiments across individuals is in- 
cluded in P technique experiments 
across occasions. Presumably, then, 
by careful matching, factors emerg- 
ing in studies of the single individual 
can be more clearly identified by ref- 
erence to the marker variables from 
earlier nomothetic studies. While I 
do not share Cattell's optimism that 
there exists only a limited set of 10-20
response dimensions built into the 
repertoire of the human being and 
that these will emerge in both R and 
P technique experiments, the basic 
idea of common marker variables is a
good one. 

In dealing with psychophysiolog- 
ical measures, special problems arise 
which make it very difficult to de- 
velop appropriate marker variables 
for inclusion in both types of investi- 
gations. Indeed, it is difficult enough 
to decide which of many possible de- 
pendent variables from a given 
measuring instrument is the most ap- 
propriate to use, whether one is 
studying many individuals or only 
one. Too frequently the choice of a 
particular dependent variable is 
dictated by the peculiar character- 
istics of the apparatus employed 
rather than by any firm confidence in 
the validity of the measure with re- 
spect to some concept such as anx- 
iety or autonomic reactivity. 

3 Raymond B. Cattell and Ivan H. Scheier,
The Meaning and Measurement of Neuroticism
and Anxiety. Copyright 1961, The Ronald
Press Company.

The work by Holtzman and Bitterman
(1952, 1956), Bitterman, Krauskopf,
and Holtzman (1954), and
Bitterman, Holtzman, and Barry 
(1955) on the development of meas- 
ures from the conditioning and ex- 
tinction of the galvanic skin response 
is a good example of the problems one 
typically encounters in deriving a set 
of meaningful psychophysiological 
variables worthy of systematic com- 
parison with other data in a factorial 
study. Problems of apparatus design, 
choice of experimental procedures, 
control of physical environmental 
and apparatus variables, units of 
measurement, scaling, correction for 
baseline performance, elimination of 
both experimental and statistical 
artifacts—all plague the investigator 
from start to finish if he takes seri- 
ously the mandate to employ reliable, 
valid procedures that are multivari- 
ate in nature. 

Unlike such molar variables as 
mental test scores or behavioral rat- 
ings, psychophysiological variables 
are often continuous in nature. The 
usual procedure is to employ some 
kind of continuous recording ap- 
paratus such as an electrocardio- 
graph, electroencephalograph, or 
dermohmeter for measuring changes 
in skin resistance. A prearranged set 
of stimuli may follow a period of 
adaptation and, in turn, may be fol- 
lowed by a period of recovery to a 
resting state. Simultaneous recording 
of several physiological measures is 
often undertaken. The entire time of 
measurement may take only a few 
minutes and may be repeated on a 
number of different occasions as in P 
technique studies. Viewed concep- 
tually, such a program of data collec- 
tion can be thought of as successive 
sets of multiple time series on the
single individual, one set for each
separate occasion. Scores generated 
from time series analysis in the 
microcosm of the single occasion, in
turn become the bits of information
constituting the multiple time series 
in the macrocosm of the single in- 
dividual across many occasions, out 
of which, in turn, is generated the 
supermacrocosm across many in- 
dividuals. Is it any wonder that one 
may well get discouraged in attempts 
to map out a grand scheme for the 
multivariate description and explana- 
tion of human behavior! 

The selection and quantification of 
variables in the microcosm of time 
series collected on a single occasion 
has been studied recently by com- 
munication engineers, biophysicists, 
and statisticians, employing the 
methods of spectral analysis, autocor- 
relation, and the theory of random 
processes. An excellent overview of 
selected work on the problems of 
data reduction when dealing with 
neuroelectric phenomena is given by 
Rosenblith and his associates in the 
Communications Biophysics Group 
at the Massachusetts Institute of 
Technology (1959). In discussing the 
use of correlation techniques for the 
reduction of EEG data, Molnar, 
Weiss, and Geisler (1959) adopt a 
random process model and analyze 
both autocorrelation and cross-cor- 
relation functions over short periods 
of multichannel recording repeated 
periodically over many days on a sin- 
gle individual, essentially the same 
design as P technique. For the im- 
mediate future, however, it is more 
likely that the investigator conduct- 
ing P technique experiments will have 
to fall back on his intuition or make 
arbitrary choices with respect to the 
appropriate units of measurement 
to employ when dealing with psycho- 
physiological variables through time. 


CONCLUDING REMARKS 


Nothing has been said about the 
methodological issues that are common
to all forms of factor analysis,
regardless of whether the correlation
matrix is obtained from R technique 
across subjects or from P technique 
across occasions. Enough is being 
said about these more general prob- 
lems by other participants in this 
symposium. It is sufficient here to 
point out that these other issues do 
exist and must also be taken into ac- 
count in P technique experiments. 
No attempt has been made to sum- 
marize, review, or criticize specific 
studies involving P technique, al- 
though several dozen such studies 
have been reported in the past 
decade. Rather, I have chosen to 
stick closely to those methodological 
difficulties peculiar to P technique 
that are as yet largely unsolved. Until
much further work of a fundamental
nature is done on the theory of analysis
of multiple time series, it is unlikely
that major improvements in P
technique will be made. It will take 
an unusual combination of mathe- 
matical-statistical sophistication and 
first-hand familiarity with the sub- 
tleties of empirical data gathered by 
repeated measurement of the single 
individual to solve the frustrating 
problems introduced by the serial
correlation present in most time
series. 

In the meantime, what is the investigator
to do? While recognizing
many of the issues outlined above,
Cattell, Mefferd, and others insist 
that factor analysis is still a reasona- 
bly good tool for the reduction and in- 
terpretation of data gathered through 
repeated measurement of a single in- 
dividual. The model of P technique 
is appealing in many respects. Most 
psychologists would admit that even 
a crude statistical method for study- 
ing the uniqueness of the single in- 
dividual is better than none at all. 

The situation may not be as 
gloomy as implied in the above dis- 
cussion. There are indeed some 
psychological, physiological, bio- 
chemical, and behavioral variables 
that have little or no serial correla- 
tion when measured repeatedly 
through time. Certainly the cur- 
rently advocated procedures of fac- 
tor analysis are reasonably valid in 
such instances. Given extensive com- 
puter resources, the mathematical 
sophistication to follow some of the 
latest methods for analysis of multi- 
ple time series, such as those ad- 
vocated by Quenouille, and a healthy 
skepticism concerning the stability of 
any results on a single case unless replicated
again and again, there is no
reason why one shouldn't move ahead 
immediately with such investiga- 
tions, even where a high degree of 
serial correlation is present.

Vigilance research concerns the
attentiveness of the subject and his
capability for detecting changes in
stimulus events over relatively long
periods of sustained observation.
Interest in this topic has accelerated
rapidly and the volume of experimental
findings has increased steadily
in recent years. With investigators
spread over several continents and
publishing under the sponsorship of
numerous military, industrial, and
academic organizations, it has become
a major problem to keep up
with the technical literature. This review
is a critical survey of the existing
literature, with emphasis being given
to the organization of experimental
results under the several theoretical
hypotheses which have been advanced
in explanation of the findings.
The many diverse sources of technical
papers made complete coverage of
the literature difficult, but it is believed
that only a small fraction of the
papers relevant to contemporary
theories were unavailable.

The increased number and productivity
of researchers has been
associated with a greater variety
of experimental situations. Those 
covered in this paper generally are 
one of four types: (a) Classical vigilance
tasks, e.g., the Mackworth
Clock Test (Mackworth, 1950). Near- 
threshold transient critical signals 
are randomly presented against a 
background of neutral signals. (b) 
Multiple display situations where a
critical signal could occur at any one 
of several stimulus sources, e.g.,
Broadbent's Twenty Dials Test 
(Broadbent, 1950). Constant scan- 
ning of the several stimulus sources is 
required. (c) Threshold measure- 
ment, e.g., Bakan (1955). A train of 
signals is presented, starting at ran- 
dom intervals in time, with an in- 
tensity increment at each step until 
the observer detects the signal. (d) 
Observing response experiments, e.g.,
Holland (1957, 1958), where visual 
attending is measured indirectly 
through some other response that sug- 
gests observing of the stimulus dis- 
play. Frequency of observing is then 
related to percentage detection of the 
critical signal occurring on the display.

Although many experiments were 
generated by specific practical ques- 
tions, a framework for organizing the 
accumulation of empirical findings 
has not been neglected. It is now possible
to distinguish a number of
explanatory systems. The main purpose
of this paper is to review these
models and discuss them in terms of
their effectiveness in accounting for 
the empirical findings. 


INHIBITION 


Mackworth (1950) advanced the 
first comprehensive interpretation of 
vigilance behavior relating observed 
phenomena of watch-keeping to prin- 
ciples of Pavlovian classical condi- 
tioning. In the same report Mack- 
worth presented extensive data on 
deterioration of criterion perform- 
ance on a number of different tasks 
under conditions of prolonged moni- 
toring—the Clock Test, the Syn- 
thetic Radar Test, and the Auditory 
Listening Test. The Clock Test, used 
most extensively, had a blank circular 
face with a hand that moved one step 
each second. Occasionally the hand 
moved in a double step, according to 
a prearranged schedule, and this was 
the critical signal to be detected and 
reported by pressing a response key. 
Over a 2-hour observation period it 
was found that the percentage of 
critical signals detected was a de- 
creasing negatively accelerated func- 
tion of time, with the greatest drop 
occurring during the first half-hour. 
The analogy drawn with classical 
conditioning was that original condi- 
tioning took place in the demonstra- 
tion period where the conditioned 
stimulus was the double jump of the 
clock hand and the unconditioned 
stimulus was the experimenter's informing
comment "Now!" The conditioned
voluntary response was the
subject's pressing the key to the 
double jump. Knowledge of results is 
a reinforcing state of affairs. The 2- 
hour observation period then, was 
considered an extinction period where 
the unconditioned stimulus and reinforcement
provided by the experimenter
was absent. During extinction
the percentage of detections
declined, and Mackworth attributed
this decline to the growth of internal
inhibition. Other evidence for inter- 
preting vigilance data in terms of 
Pavlovian conditioning was the tem- 
porary but complete restoration of 
initial performance level by the oc- 
currence of a telephone message to 
the observer in the middle of the 
watch-keeping session. Mackworth 
viewed this as an instance of disin- 
hibition where an alien stimulus pro- 
duced a temporary increase in re- 
sponsiveness. Other evidence for a 
classical conditioning interpretation 
occurred in an experiment when the 
experimenter provided knowledge of 
results after each double jump of the 
clock hand and prevented the occur- 
rence of a decrement in detection. 
This, within Mackworth's explana- 
tory frame of reference, would be a 
reinforcing operation and would be 
expected to keep the performance 
level high. 

Mackworth was obliged to qualify 
a strict interpretation in terms of 
classical conditioning because he was 
unable to obtain anything near com- 
plete failure of responding, i.e., total 
experimental extinction. In fact, for 
the Clock Test, the level of detection 
for the critical signal ordinarily stabi- 
lized at about 70-75%. A state of 
expectancy and self-instructions were 
hypothesized as partly replacing the 
unconditioned stimulus and its rein- 
forcing function. 
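
The course of the decrement described above, a negatively accelerated decline that levels off well above zero, can be sketched as an exponential approach to an asymptote. The exponential form and every parameter value below (starting level, asymptote, rate constant) are assumptions made for illustration only; they are not fitted to Mackworth's data.

```python
import math


def detection_level(minutes, start=0.85, floor=0.72, rate=0.05):
    """Hypothetical probability of detection after `minutes` of watch.

    Assumed form: exponential decay from `start` toward `floor`.
    All three parameters are illustrative, not empirical estimates.
    """
    return floor + (start - floor) * math.exp(-rate * minutes)


# Mean detection level in each half-hour of a 2-hour watch.
half_hours = [
    sum(detection_level(m) for m in range(i * 30, (i + 1) * 30)) / 30
    for i in range(4)
]

# Loss from one half-hour to the next.
drops = [a - b for a, b in zip(half_hours, half_hours[1:])]
print(half_hours)
print(drops)
```

With these assumed values the largest loss falls between the first and second half-hours, and the curve settles near the assumed floor rather than at zero.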

The inhibition analysis of vigilance 
behavior came with a ready-made 
development so that the theorist's 
task has been one of coordinating 
aspects of vigilance behavior with 
conditioning phenomena. From the 
standpoint of handling more recent 
results, inhibition does not fare very 
well. For example, a high frequency 
of signals should result in a greater 
vigilance decrement than low fre- 
quency signals because, within the 
classical conditioning framework of
the inhibition hypothesis, it represents
a relatively high frequency of ex-
tinction trials. Yet, Deese and
Ormond (1953) and Jenkins (1958) 
show the opposite to be true. Mack- 
worth (1957) himself has come to 
regard the expectancy explanation 
as more important in accounting for 
recent findings, but he was not ready 
to dispense with reinforcement and 
nonreinforcement effects completely. 
Although the inhibition hypothesis 
plausibly accounted for many of the 
observed results of Mackworth’s ex- 
periments, it never gained wide ac- 
ceptance. Mackworth's research 
rather than his interpretation has 
been responsible for generating new 
experiments. Reluctance to accept 
the inhibition explanation also has 
been based on attitudes towards 
theory construction. For example, 
Deese (1955, p. 366) felt it unneces- 
sary to postulate separate inhibitory 
and excitatory processes when a single 
state of vigilance which declines
under specified conditions will handle 
the data as well. 


ATTENTION 


Broadbent (1953) carried the anal-
ogy of watch-keeping to Pavlovian
conditioning even further than Mack-
worth. However, instead of inter-
preting vigilance in terms of condi-
tioning, he interpreted both in terms
of attention. He contends that the
organism will select stimulus subsets
from the impinging stimuli because
(a) the nervous system cannot handle
the total volume of stimulation at any
one time, and (b) adequate re-
sponse to one part of the stimulus
situation is incompatible with ade-
quate response to another part. At
least three properties of stimuli are
important in determining priority of
selection. Physically intense stimuli
are more likely to be selected than weak
stimuli. Stimuli of greater biological
importance at the moment have higher
priority of selection. Finally, novel 
stimuli, i.e., those differing more from 
immediately preceding stimuli, have 
an increased likelihood of being se- 
lected. 

The decrement in accuracy of de- 
tection is accounted for by the com- 
petition of stimuli in Broadbent's 
view. The repeated application of a 
stimulus results in reduced novelty, 
allowing other parts of the stimulat- 
ing situation to gain priority. Decre- 
ments over time have been reported 
for a variety of response measures. 
The following references are repre- 
sentative, not exhaustive: probability 
of detection (Adams, 1956; Bakan, 
1952, 1953, 1955, 1957; Deese & 
Ormond, 1953; Jenkins, 1958; Jerison, 
1958, 1959; Jerison & Wallis, 1957a, 
1957b; Kappauf & Powe, 1959; 
Mackworth, 1948, 1950, 1957), re- 
sponse time (Garvey, Taylor, & 
Newlin, 1959; McCormack, 1958), 
threshold intensity (Bakan, 1955; 
Garvey, Henson, & Gulledge, 1958; 
McFarland, Holway, & Hurvich, 
1942). 

During a rest period different stim- 
uli are selected allowing the original 
task stimuli to regain novelty. This 
corresponds to the observed improve- 
ment in detection following rest. 
Mackworth (1950) and Jenkins (1958)
found an increase in probability of 
detection with interspersed rest pe- 
riods. Adams (1956) reported recov- 
ery of decrement following a 10-min- 
ute rest. Similar recovery was found 
in response time (McCormack, 1958) 
and luminance threshold (McFar- 
land et al., 1942) following rest. 

In similar fashion, a new stimulus 
introduced between applications of 
the original stimulus will temporarily 
renew the novelty of the original one 
since it is then different from the im- 
mediately preceding stimulus. Mack- 
worth’s telephone message would fit 
this category. McFarland et al.
(1942) observed that forced conver- 
sation with an observer after 105 
minutes of measuring luminance 
threshold produced a marked but tem- 
porary increase in sensitivity. Body 
stretching had a similar effect and 
can be interpreted as new stimulation. 
The same reasoning applies when 
knowledge of results is given following 
each critical signal; the novelty of the 
signal is maintained (Baker, 1959b; 
Mackworth, 1950; Pollack & Knaff, 
1958). We see then that in Broad- 
bent's model, disinhibition and rein- 
forcement are examples of the same 
phenomenon. 

When several sources must be 
monitored some of them have higher 
priority initially, but as the watch 
period progresses attention shifts 
towards previously neglected sources. 
The overall level of performance is 
maintained, but becomes irregular. 
In an experiment using 20 dials as 
signal sources Broadbent (1950) found 
this to be the case. Loeb and Jean- 
theau (1958) reported no decrement 
in a 20 dials test; Howland (1958) 
found the same result using four 
meters. With a three-clock display, 
detection level remained stable (Jeri- 
son & Wallis, 1957a; Jerison & Wing, 
1957) but in comparison with a one- 
clock display Jerison and Wallis 
(1957a) found overall detection level 
was much lower. A fine-grain analy- 
sis in the same report hinted that a 
decrement may have occurred in just 
the first 3-4 minutes of the watch 
with three clocks, but this is quite a 
different order of phenomenon than 
the large decrement for a one-clock 
test unit that develops over a rela-
tively long time period.

The critical signal itself serves as a 
novel stimulus and partially restores 
performance. Broadbent used this to 
explain Mackworth’s (1950) finding 
that observers who detected more
signals maintained a higher level of
vigilance throughout the watch. 
This interpretation was also related 
to the fact of better performance with 
higher signal rates (Deese & Ormond, 
1953; Garvey et al., 1959; Jenkins, 
1958; Kappauf & Powe, 1959). 

A continuous intense irrelevant 
stimulus will not initially have much 
effect. As the watch continues and 
the original stimulus loses novelty 
any decrement will be accentuated. 
This is more likely with a single stimu- 
lus source. Broadbent (1954) had 
observers monitor 20 dials over a 5- 
day period with noise on Day 3 and 
Day 4. Under noise conditions there 
were significantly fewer responses 
made in 9 seconds or less, than in 
quiet conditions. Using 20 lights he 
found no significant difference, but 
the lights were more noticeable 
signals. Loeb and Jeantheau (1958) 
also using 20 dials reported longer 
response latencies throughout the 
watch with noise and vibration, but 
no changes with time. In a three- 
clock test, Jerison and Wing (1957) 
introduced noise for 1½ hours follow-
ing ½ hour of quiet. A decrement
developed in the final half-hour. 
With only one clock, Jerison and 
Wallis (1957b) found no effect of 
noise relative to quiet conditions. 

Broadbent’s model consists essen- 
tially of the assumption that selection 
of stimuli is necessary in accordance 
with the three stimulus properties 
given above. The formal develop- 
ment of the model was not carried 
beyond these broad assumptions. Al- 
though Broadbent did not intend to 
cover all of the facts of vigilance, 
even the results he did consider seem 
more specific than the model can con- 
vincingly handle. Application to the 
general trends such as decrement with 
prolonged watch, recovery of per- 
formance with rest, and maintained 
efficiency with continued knowledge 
of results appeared grossly to follow
from the principles of stimulus selec- 
tion. In other cases where the experi- 
mental situation was subjected to 
finer analysis it was difficult to see 
how Broadbent's interpretations fol- 
lowed from the model. A number of 
his more detailed applications seemed 
like ad hoc explanations of known 
results rather than predictions de- 
duced from the original principles. 


EXPECTANCY 

The expectancy hypothesis of vigi- 
lance was originally proposed by 
Deese (1955). He began with the 
notion of an excitatory state of vigi- 
lance which determines the probabil- 
ity of detection for any observer. The 
expectancy hypothesis states that: 
(a) the observer's expectancy or prediction
about the search task is determined by the
actual course of stimulus events during his
previous experience with the task, and (b) the
observer's level of expectancy determines his
vigilance level and hence his probability of
detection (p. 362).


It should be emphasized that the 
second part of the hypothesis does 
not, for Deese (1955), imply that 
level of vigilance is directly deter- 
mined by expectancy (pp. 364-365). 
The level of vigilance for any ob- 
server is also subject to modification 
due to changes in his motivational 
states, whereas his extrapolation of
future stimulus events might not be
affected by such changes. Deese
wished to avoid the artificial situa-
tion where expectancy completely
determines vigilance. These states,
according to Deese, are the basis of
individual differences in vigilance and
it is the psychologist's task to dis-
cover indices of behavior which
predict levels of vigilance expected of
an individual in a search task. But
do the nonexpectancy states serve
merely to raise or lower the prob-
ability of detection by some constant
amount throughout the task, or is the
form of the detection curve over time 
changed as well? Deese does not 
clarify this matter too well, but a free 
interpretation of his exposition on 
vigilance (Deese, 1955) would suggest 
that the nonexpectancy states deter- 
mine a base level for an individual's 
vigilance. Expectancy, however, 
determines both the overall level and 
the short range variations in prob- 
ability of detection. It is assumed 
that the average level of expectancy, 
and thus detectability, is a positive 
function of signal rate, while the 
short range variations in expectancy 
are determined by the ongoing inter- 
signal interval. Deese assumes that 
expectancy is determined by all of 
the past stimulus events in the task 
and he elaborates this notion by re- 
lating expectancy to intersignal inter- 
val and stating that it increases up to 
the value of the mean intersignal 
interval and beyond. Thus, it would 
appear that probability of detection 
would be below average when an 
intersignal interval is less than the 
mean interval, and equal to or greater 
than the average probability of de- 
tection when the intersignal interval 
is equal to or greater than the mean. 
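
One way to make this reading of Deese concrete is a small sketch in which probability of detection rises with the current intersignal interval and passes through its average value when the interval equals the mean. The linear form, the slope, and the average level used here are hypothetical choices for illustration; the text specifies only the direction of the relation.

```python
def deese_detection(interval, mean_interval, p_avg=0.80, slope=0.10):
    """Hypothetical detection probability for one intersignal interval.

    Assumption: a linear rise that crosses `p_avg` at the mean interval,
    clipped to the range 0..1. Only the direction of the effect comes
    from the text; the functional form is invented for this sketch.
    """
    p = p_avg + slope * (interval - mean_interval) / mean_interval
    return max(0.0, min(1.0, p))


mean = 150.0  # seconds; an assumed mean intersignal interval
for interval in (30.0, 150.0, 600.0):
    print(interval, round(deese_detection(interval, mean), 3))
```

Under these assumptions an interval shorter than the mean yields a below-average probability, and intervals at or beyond the mean yield average or above-average probabilities.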

About the only evidence that can 
be found for the expectancy hypothe-
sis is that probability of detection is a
positive function of signal rate (Deese 
& Ormond, 1953; Jenkins, 1958). 
However, little or no evidence can be 
found in support of Deese's views of 
expectancy as a function of inter-
signal interval. In analysis of some
of his own data (Deese & Ormond, 
1953), Deese found little effect of 
interval size, although a slight tend- 
ency for higher probability of detec- 
tion for longer intervals could be 
considered small support for the 
expectancy prediction. Analysis of 
intersignal interval has not led to any 
consistent results even yet. Jerison 
and Wallis (1957a) and McCormack
(1958) found no effect of interval size. 
Jenkins (1958) reported detection 
dropped monotonically with increas- 
ing intervals when average rate was 
as high as 480 per hour; at lower rates 
he found no effect. Bartlett, Beinert, 
and Graham (1955) found lower prob- 
ability of detection with longer inter- 
vals using a 40 per hour signal rate. 
Mackworth's data showed better 
detection for brief intervals than for 
his longer 10-minute intervals. Kap-
pauf and Powe (1959) reported a U-
shaped function in an audio-visual
checking task. One can scarcely im- 
agine a more varied set of results and 
considering average rates and ranges 
of time intervals does not resolve the 
conflict among these data. Jenkins 
(1958) suggested that the average 
rate of signals has a much greater 
effect on detection level than short 
range fluctuations, so the issue be- 
comes less critical from a practical 
standpoint. Harabedian, McGrath, 
and Buckner (1960) emphasize that 
for a basic understanding, a major 
methodological problem exists in de- 
fining an intersignal interval because 
it can be expressed in terms of (a) 
time between signals, whether the 
signal is detected or not; (b) time 
since the last detected signal; and (c) 
time since the last missed signal. 
Their results from audio and visual 
vigilance tasks revealed differences 
dependent upon the method chosen 
to define the interval. 
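
The three definitions can be made concrete with a short sketch that computes each interval for every signal in a session. The function name, the record layout, and the example session are hypothetical and are not taken from Harabedian et al. (1960).

```python
def intersignal_intervals(times, detected):
    """Return the three interval definitions for each signal after the first.

    times    -- signal onset times in seconds, in ascending order
    detected -- parallel list of booleans (True if the signal was reported)

    An entry is None when no earlier detected (or missed) signal exists.
    """
    rows = []
    last_detected = None
    last_missed = None
    for i, (t, hit) in enumerate(zip(times, detected)):
        if i > 0:
            rows.append({
                # (a) time since the previous signal, detected or not
                "since_previous": t - times[i - 1],
                # (b) time since the last detected signal
                "since_last_detected":
                    None if last_detected is None else t - last_detected,
                # (c) time since the last missed signal
                "since_last_missed":
                    None if last_missed is None else t - last_missed,
            })
        if hit:
            last_detected = t
        else:
            last_missed = t
    return rows


# Hypothetical session: four signals, the second and fourth missed.
session = intersignal_intervals([0, 60, 150, 200], [True, False, True, False])
for row in session:
    print(row)
```

For the third signal in this invented session the three definitions give 90, 150, and 90 seconds, which shows how the choice of definition alone can change the interval assigned to the same event.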

Baker (1958, 1959a, 1959b, 1959c) 
has elaborated Deese's expectancy 
hypothesis and has provided a body of 
experimental evidence in support of 
his own views. A major portion of 
Baker's arguments in applying the 
expectancy model to experimental 
variables rests on the single consider- 
ation that an operator's expectancy 
is based on how he perceives the 
actual series of stimulus events. Any
variation which makes confirmation
of expectancy more likely or which 
allows more accurate perception of 
the actual stimulus events should 
lead to better performance. For ex- 
ample, when a signal is missed and 
the observer is unaware of the omis- 
sion, the intersignal interval is in- 
creased and expectancy is lowered. 
The concern of Harabedian et al. 
(1960) with problems of defining 
intersignal interval would seem to be 
close to Baker's interests here. 

Operationally, Baker's expanded 
definition of expectancy has five 
major classes of variables: 

Average signal rate. Baker's pri- 
mary interests have been in predict- 
ing short range variations in detec- 
tion as a function of intersignal 
interval, but he would seem to agree 
with Deese and Jenkins that detec- 
tion probability is a positive function 
of average signal rate (Baker, 1959c). 

Regularity and range of the inter- 
signal interval. Regularity of the 
signal increases the probability that 
the expectancy state will be rein-
forced. Baker contends that expect-
ancy grows as the interval following a
signal increases to the value of the
mean intersignal interval and, beyond
the mean value, expectancy falls to a
low level. Notice that this is a modifi-
cation of Deese's view (1955). Baker
(1959b) tested this hypothesis in a
reaction time experiment similar to
that of Mowrer (1940). He measured
button pressing response times to an
initial series of 20 light signals and
then varied the interval before the
final twenty-first signal. Using a
2 × 2 factorial design the initial series
was presented at 10-second or 2-
minute mean intervals with regular or
irregular intersignal intervals. Fol-
lowing the regular 10-second series
changes in reaction time to the
twenty-first signal paralleled the pre-
dicted course. For very short inter-
vals reaction time was long and fol-
lowing a decrease as the mean interval
was approached reaction time again 
became somewhat longer up to the 
highest value tested (30 seconds), but 
not nearly as long as the reaction time 
for short intervals. The irregular and 
longer interval condition showed little 
variation in reaction time to the 
twenty-first signal although when a
trend appeared it tended to support 
the expectancy hypothesis. This ex- 
periment suggests that in vigilance 
tasks, where the signals are always 
irregular and usually occur at low 
average rates, one should not expect 
to find a large effect of intersignal 
interval. 
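
Baker's modification can be sketched as a function in which expectancy climbs as the interval approaches the mean and falls away beyond it. The piecewise-linear form and the parameters `fall` and `low` are assumptions introduced for this example; the gentler slope beyond the mean is chosen only to mirror the reaction time pattern reported for the regular 10-second series.

```python
def baker_expectancy(interval, mean_interval, fall=0.25, low=0.2):
    """Hypothetical expectancy (0..1) at a given time since the last signal.

    Assumptions: expectancy rises linearly to 1.0 at the mean interval,
    then declines at a gentler rate (`fall`) toward a lower bound (`low`).
    """
    if interval <= mean_interval:
        return max(low, interval / mean_interval)
    excess = (interval - mean_interval) / mean_interval
    return max(low, 1.0 - fall * excess)


mean = 10.0  # seconds, as in the regular 10-second series
short, at_mean, late = (baker_expectancy(t, mean) for t in (2.0, 10.0, 30.0))
print(short, at_mean, late)
```

If reaction time is taken to vary inversely with expectancy, this ordering corresponds to long reaction times at very short intervals, the fastest responses near the mean, and intermediate times at the longest interval tested.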

The range of intersignal intervals 
was related to the occurrence of 
decrement, and apparently Baker has 
been the first to demonstrate this 
phenomenon. Using a simulated 
PPI display, Baker (1958) found no 
decrement when the intersignal inter- 
vals ranged from 36-196 seconds, but 
in a later study (Baker, 1959a), using
the same task, found a decrement 
when the interval range was increased 
to 45-645 seconds. Interestingly, this 
latter range of intervals was that 
used by Mackworth in generating his 
well-known decrements in detection 
with several different visual and
auditory monitoring tasks. In an- 
other experiment, using a simulated 
B-scan radar display, Baker (1959b) 
assessed the effects of complete signal 
regularity with an occurrence every
2½ minutes, a random series with a
range of intervals from 1 to 6 minutes,
and a wide range of intervals where
the spread was ½-10 minutes (ran-
domly arranged). Signal frequency
was the same for all groups, being 24
per hour. Decrement was found only
in the group with the widest range
of intersignal intervals. Baker (1959c)
would interpret this as the subject
abandoning any efforts to form ex-
pectancies because it is done too
imprecisely when intervals are long. 

Knowledge of results. This variable 
prevents a decrement by allowing an 
accurate perception of the sequence 
of stimuli. Baker (1959b) tested 
three groups of subjects given (a) no 
information; (b) complete informa- 
tion on correct, missed, and false 
signals; and (c) repetition of a missed 
signal at five-second intervals until 
detected. The task was visual detec- 
tion, the observation time 1 hour, and 
the signal rate 24 an hour. Only the 
group with no feedback had a signifi- 
cant decrement. Mackworth (1950) 
earlier had found that informing an 
observer of his success or failure in 
detecting a signal served to com- 
pletely eliminate the decrement in 
detection. Pollack and Knaff (1958) 
obtained results similar to Mack- 
worth's. 

Knowledge of signal location on a 
visual display. Knowledge of signal 
location makes confirmation of ex- 
pectancy more likely. With increas- 
ing variability in location the ap- 
propriate part of the search area may 
not be scanned when the signal occurs. 
This leads to a lower apparent signal 
frequency and lower probability of 
detection. While Baker is concerned 
only with spatial variables as they 
influence temporal expectancy, Mack- 
worth (1950, pp. 58-59) implies a 
spatial expectancy for signal occur- 
rence at one or more locations as a 
state distinct from temporal expect- 
ancy. Deese and Ormond (1953) 
varied the distribution of signals on 
a radar display presenting 50% in 
one quadrant. Detection in the high 
probability quadrant was only slightly
superior to the other three during 
an hour period, but the overall prob- 
ability was very high. In a similar 
experiment, Nicely and Miller (1957) 
found greater detection in the more 
frequent quadrant, the difference 
increasing in the last half-hour of
the 74-minute watch. Level of de- 
tection in the high probability area 
remained relatively constant through- 
out. Bartlett et al. (1955) used the 
method of constant stimuli to meas- 
ure brightness thresholds. Knowing 
the time but not the location of a 
signal led to higher thresholds than 
knowing both the time and place of 
occurrence. A much larger decrement 
in performance was observed when 
neither time nor location was known. 
Krendel and Wodinsky (1959) meas- 
ured time to detect a randomly lo- 
cated light signal when the time of 
onset was known. They found no 
decrement in search time over an 
hour. Finally, Garvey et al. (1958) 
reported that no increase in stimulus 
intensity was necessary to detect a 
signal appearing after 60 minutes of 
monitoring provided the observer was 
warned of signal time and location 
before it appeared. Without such 
knowledge observers showed a large 
increase in visual threshold. 

Signal intensity. Baker holds that 
expectancy is more likely to be con- 
firmed with more intense signals. 
Both Adams (1956) and Mackworth 
(1950) found higher probability of 
detection for visual signals of higher 
intensity than lower intensity. And, 
if we assume that perceived signal 
intensity is related to the duration of 
the signal, Adams (1956) found a 
higher probability of detection for 
signals of 2 seconds in length than for 
signals of 1 second. 

Looking at the combined efforts of 
Deese and Baker in forwarding the 
expectancy hypothesis we again have 
a theory at the early stages of develop- 
ment making qualitative predictions 
about vigilance behavior. The under- 
lying assumptions were set forth 
more clearly than was the case with
Broadbent's attention hypothesis,
with the result that the expectancy
hypothesis lends itself more readily
to testing. A case in point is Baker's 
report (1959b) initiated with the in- 
tention of evaluating the model. The 
expectancy hypotheses do not grapple 
explicitly with the classical vigilance 
issue of decrement accruing over ob- 
servation time, which might be listed 
under long range effects, and distin- 
guished from short range effects where 
momentary determiners of response 
(intersignal interval, spatial location 
of the signal, etc.) are emphasized. 
These latter effects intrigue expect- 
ancy theorists. Other variables 
known to be important are largely 
neglected by Deese and Baker, e.g., 
rest periods and environmental fac- 
tors such as presence of the experi- 
menter, interpolated messages, and 
noise. Oddly enough, Deese (1955) 
devoted some space to the importance 
of varied background sensory input in 
maintaining vigilant behavior but he 
did not relate it to expectancy and 
thus might be said to have a two- 
factor theory. Baker (1959c) men- 
tioned environmental factors as pos- 
sible distractions that would lower 
apparent signal frequency, but these 
factors were not formally entered into 
his theory. 


VARIED SENSORY ENVIRONMENT 


Scott (1957) explored Hebb’s thesis 
(1955) that stimuli serve a dual func- 
tion: (a) they have a cue function in 
controlling goal responses (the func- 
tion usually ascribed them in learning 
theories), and (b) an arousal or vigi- 
lance role to which Hebb (1955) 
ascribes motivational properties. 
Scott feels that the arousal function of
stimuli has been largely ignored and 
should be given more attention. 
Broadbent (1958) calls this the 
"activationist hypothesis" in vigi- 
lance research, and his views (Broad- 
bent, 1953) on stimulus variety were 
somewhat similar. To document 
implications of arousal for vigilance,
Scott surveyed the literature con- 
cerned with performance deteriora- 
tion in a variety of repetitive tasks 
with particular attention to the uni- 
formity of sensory environment that 
accompanied such activities. He 
concluded that loss of efficiency was 
directly related to reduction in stimu- 
lus variation. When background 
stimuli are at a minimum and only 
occasional and often low key critical 
stimuli are present, rapid deteriora-
tion should be expected. The more
unchanging are the critical stimuli, 
the sooner deterioration will occur. 
Rest periods and introduction of 
extraneous stimuli serve to increase 
the variety of stimulation needed to 
maintain or restore efficient behavior. 

Neurophysiological as well as be-
havioral research supports the im-
portance of a secondary role for
stimuli. Impulses from the same 
sensory stimuli have been shown to
reach the cerebral cortex via two
different pathways. They travel
directly along the sensory tract to
the corresponding nucleus in the
thalamus and terminate in a specific
projection area of the cortex. A sec-
ond pathway has been studied,
wherein impulses from the same
stimuli travel a slow circuitous route
through the ascending reticular acti-
vating system which discharges a
diffuse bombardment over wide areas
of the cerebral cortex. The latter
nonspecific stimulation is consid-
ered necessary for the maintenance
of organized behavior. Scott (1957), Hebb
(1955), Lindsley (1957), Malmo
(1959), and Samuels (1959) sum-
marize the experiments related to
this work.


Given the nonspecific effect of
stimuli on behavioral organization,
Scott suggested that stimuli lose
their nonspecific effects with con-
tinued exposure, the rate of such
habituation increasing as the environ-
ment is more uniform. This process,
termed "sensory habituation," results
in a wide range of modifications in 
behavior of which loss of vigilance is 
one of the earliest to appear. Under 
conditions of severe isolation over 
extended periods more serious symp- 
toms such as hallucinations appear, 
as in the McGill studies of sensory 
deprivation. 

The sensory habituation theory 
finds application to vigilance tasks in 
a number of ways. One would expect 
to find performance restored to or 
maintained at a higher level under 
conditions which increase the variety 
of either peripheral or relevant task 
stimuli. Examples cited by Scott 
(1957) included: rest periods, high 
signal rate, knowledge of results, 
interpolated messages, use of tasks 
with multiple stimulus sources, and 
presence of the experimenter. Data 
relevant to most of these factors has 
been summarized in earlier sections, 
with the work of McFarland et al. 
(1942) being particularly relevant. 
The results from vigilance tasks with 
multiple stimulus sources support 
Scott’s position quite well and, in 
fact, his is the only successful theory 
in this vein (Broadbent, 1950; Hoff- 
man & Mead, 1943; Howland, 1958; 
Jerison & Wallis, 1957a, 1957b; 
Jerison & Wing, 1957; Loeb & Jean- 
theau, 1958). All of these studies 
have the distinctive feature of show- 
ing no vigilance decrement whatso- 
ever—a puzzling but consistent find- 
ing that has received little systematic 
attention. Most investigators have 
chosen to use tasks where decrement 
is known to occur and largely have 
ignored the potential for understand- 
ing vigilance behavior that might be 
found in studying the tasks that fail 
to yield detection decrement. For 
example, one might entertain the 
hypothesis that the sensory inputs
sustaining responsiveness might arise 
from the proprioceptive stimulation 
derived from head and eye move- 
ments. It appears that under some 
experimental conditions, not yet 
clearly defined, task complexity and 
variety eliminate vigilance decre-
ments in most cases. A negative in-
stance is Garvey et al. (1959), where
a decrement was found with a multi- 
dial task using a very low signal rate. 
Their task was somewhat different 
from the conventional vigilance task 
because they required the observer 
to detect a larger deviation occurring 
to a constantly moving needle in each 
dial. The very low signal rate (aver- 
age of 2.5/2 hours) may be the reason 
for the difference because Howland 
(1958) used the same general type of 
task and found no decrement. 

Scott was not proposing a theory of 
vigilance in his paper, but the rele- 
vance of his view to this area is clear. 
He provided convincing evidence for 
the presence of perceptual variation 
as a necessary condition in maintain- 
ing alertness. Although Mackworth 
(1950) and Deese (1955) have noted 
the importance of such variation, this 
point has not been formally incorpo- 
rated in any of the models dealing 
specifically with vigilance. It is 
worthy of attention. 


OBSERVING RESPONSES 


The analysis of vigilance behavior 
in terms of rate of observing responses 
is not a theory but a technique for 
studying vigilance. Theory enters 
the picture only in the assumption 
that detection of a signal serves as a 
reinforcement for the observing re- 
sponse (Holland, 1958). Holland 
(1957, 1958) has been the major pro- 
moter of this type of analysis. His 
purpose was to show the influence of 
schedules of reinforcement on rate of 
observing response and the parallel 
influence on detection performance. 


JUDITH P. FRANKMANN AND JACK A. ADAMS 


The extent to which rate of observing 
and probability of detection follow 
the same course would determine how 
much of the detailed knowledge about 
schedules of reinforcement (e.g., 
Ferster & Skinner, 1957) could be 
carried over directly to vigilance 
behavior. The observing response 
studied by Holland was that of press- 
ing a key to illuminate a dial. The 
pointer on the dial deflected from the 
null position at intervals set by the 
schedule of reinforcement and re- 
mained deflected until the observer 
reset the pointer by pressing a second 
key. The observer could only see the 
dial by pressing the key. 

Holland studied rate of observing 
as a function of several common rein-
forcement schedules to test the as-
sumption that detection serves as a
reinforcement. On fixed interval 
schedules ranging from 1-4 minutes, 
observers learned temporal discrimi- 
nations reflected by “scallops” in the 
cumulative response curves during 
the last of eight 40-minute sessions. 
During extinction the rate remained 
high for a time and then gradually 
decreased to a low level. 

Following a fixed ratio schedule 
with ratios increasing from 36-200 
responses per reinforcement, Holland 
reported a higher rate of observing 
with higher ratios. Extinction curves 
were typical in showing spurts of high 
responding in a jagged decline in rate. 
These results along with successful 
training on multiple schedules and 
responding at low rates led Holland 
to conclude that signal detection 
could serve as a reinforcement for 
observing responses. His next step
was to use schedules of signal presen-
tation identical to those in typical
vigilance tasks.

Rate of observing was measured on 
variable-interval schedules with aver-
age intervals of 15 seconds, 30 seconds,
1, 2, and 3 minutes. These covered
the range in terms of signal rate from
20-240 signals per hour. Observing 
rate was higher with higher signal 
rates. The 3-minute interval led to a 
decline in rate over time paralleling 
the decrement in percentage of signals 
detected reported by Deese and 
Ormond with a corresponding 20 
signals per hour. Among the other 
values studied, both 15-second and 
30-second intervals led to an increase 
in response rate with time while a 2- 
minute interval showed a decline. 
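
The correspondence between mean interval and signal rate quoted here is simple arithmetic, shown below for checking. The helper name is hypothetical.

```python
def signals_per_hour(mean_interval_seconds):
    """Convert a mean intersignal interval in seconds to signals per hour."""
    return 3600 / mean_interval_seconds


# Mean intervals of 15 seconds, 30 seconds, 1, 2, and 3 minutes, in seconds.
mean_intervals = [15, 30, 60, 120, 180]
rates = [signals_per_hour(s) for s in mean_intervals]
print(rates)
```

The shortest and longest intervals give 240 and 20 signals per hour, the limits of the range stated in the text.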

In another experiment using Mack-
worth's (1950) schedule (¾, ¾, 1½, 2, 2,
1, 5, 1, 1, 2, 3, and 10 minutes), the
signal was transient, allowing meas- 
urement of both percentage detection 
and rate of observing over a 2-hour 
period. The similarities to Mack- 
worth's results were good. Holland 
found 39% of his observers missed
one or no signals; Mackworth found
29%. Separating the “good” observers
from the “poor” observers,
Holland plotted separate curves for 
the two groups relating percentage 
detected and rate of observing as a 
function of time in half-hour periods. 
The poor observers showed a sharp 
decline over the first half-hour in both 
measures, with a continued gradual
drop until the last period where some
recovery occurred. For the good observers
percentage of signals detected
did not decline and observing rate 
increased according to a negatively 
accelerated function of time. 

Holland supplements his paper
with analogies between findings of
vigilance studies and animal studies
that use the Skinnerian cumulative
response frequency method of recording
used by Holland. He cited
[name illegible] (1956) as finding higher response
rate for rats under Benzedrine,
analogous to Mackworth's (1950) result
of increased signal detection.
Ferster and Skinner (1957) produced
different rates of response using
multiple reinforcement schedules in 
the same session with animal subjects 
and this corresponds to Nicely and 
Miller’s (1957) result of higher prob- 
ability of detection for the quadrant 
on a radar scope having a higher 
signal rate. Holland also discusses 
increased rate of responding following 
rest for both animals and vigilance 
studies. Holland further cites the 
evidence that higher room tempera- 
tures produce lower response rates in 
animals, quite analogous to Mack- 
worth’s finding (1950) that detection 
is lowered under these circumstances. 

The method of studying vigilance 
behavior through observing responses 
is subject to criticism on several 
grounds. Requiring an overt response 
such as key pressing introduces an 
element into the situation that is not 
present in free scanning vigilance 
tasks. There is the implicit assump- 
tion not only that the viewer looks at 
the display every time he presses the 
key, but also that this is the same 
scanning response that would occur if 
the subjects were not required to 
press the key. An equally reasonable 
interpretation is that the subject 
presses the key rapidly in order to 
keep the display illuminated so he 
can scan when he wants to, and the 
very high rates of responding (Hol- 
land, 1957, 1958) suggest this as the 
case. Furthermore, repetitive rapid 
pressing of a key can produce work 
inhibition or fatigue. Changes in rate 
of responding under these circum- 
stances would not necessarily reflect 
the same laws of observing if head 
and eye movements were to be meas- 
ured directly and motor fatigue was 
trivial. A paper by Blair (1958) de- 
scribed an observing response that 
makes the correspondence to normal 
scanning more likely than in the case 
of key pressing, at least for the mov- 
ing head component of visual scan- 
ning. The operator was in a darkened 
room and had a continuous light
source on his head that had to be 
directed at the display to see the 
critical signal. A light-sensitive 
germanium diode activated a re- 
corder whenever the operator 
“looked” at the display, thereby giv- 
ing a complete record of frequency 
and duration of observing responses. 
Only two of his five subjects exhibited 
Holland's finding of increased ob- 
serving as signal time approached, 
suggesting difficulties that could be 
devastating for Holland's equivalence 
of Skinnerian observing responses 
and sense receptor orientations. 
Mackworth and Mackworth (1958) 
reported a precise method of measur- 
ing eye fixations with closed circuit 
television methods. The direct meas- 
urement of eye movements (head 
movements are excluded by the tele- 
vision method) under conditions 
where the observer is allowed to scan 
a display freely can be used to test 
the hypothesis that more remote re- 
sponses such as pressing a key are 
equivalent and yield the same laws of 
observing. Until such verification is 
made the indirect approach to the 
study of vigilance must be viewed 
with some reservation. Perhaps this 
potential source of difficulty in de- 
veloping laws for observing responses 
arose from applications of Wycoff's 
(1952) general definition that held the 
observing response to be that be- 
havioral act which produces the dis- 
criminative stimuli correlated with 
reinforcement. Thus, orienting the 
eyes and head to receive stimuli being 
emitted from a display would be an 
example of an observing response, 
and Wycoff freely uses this example, 
but it is clear that his definition is 
broadly conceived and includes any 
response which produces the dis- 
criminative stimuli for the organism. 
Prokasy (1956) and Lutz and Perkins 
(1960) followed Wycoff’s lead and
studied observing responses which 
were not the sense receptor orienta- 
tions so important in vigilance re- 
search. Holland (1957, 1958), how- 
ever, proceeds one step further and 
interprets the observing response of 
key pressing to illuminate the display 
in a vigilance experiment as having 
direct correspondence with sense 
receptor orientations and to yield the
same behavioral laws. This is an ad 
hoc assumption which can be, and 
must be, proved empirically in the 
laboratory. 


CONCLUSIONS AND SUMMARY 


The main shortcoming of our con- 
temporary theories of vigilance would 
seem to be a casualness of formulation 
that makes the definitive testing of 
implications rather difficult. The 
inhibition hypothesis has an implicit 
organization provided by the classical 
conditioning paradigm, a wealth of 
relationships derived from the study 
of responses in the classical condition- 
ing situation, and the several theo- 
retical explanations of classical condi- 
tioning, but this framework for the 
vigilance problem appears to be more 
of an analogy than a scientifically 
useful theoretical system that is 
capable of rigorously accounting for 
the present facts and predicting new 
experimental findings. Even if we 
grant the classical conditioning 
schema a higher status than analogy, 
its capabilities for relating to the 
known data of vigilance experiments 
are limited. For example, Mack- 
worth (1950) sees the period of con- 
tinuous observation in a vigilance 
task as a period of experimental 
extinction. As this extinction period
continues the expectation would be 
for a steadily decreasing probability 
of occurrence for the detection re- 
sponse. Yet, this typically is not the 
case. Mackworth's own data showed
the probability of detection function 
stabilizing at an intermediate level, 
with no apparent trend toward com- 
plete extinction, and Mackworth 
acknowledged this difficulty by sug- 
gesting that other factors such as 
expectancy and  self-instructions 
might be required to account for the 
data trends. Additional explanatory 
shortcomings of classical conditioning 
are found in the high level of respon- 
siveness induced by a high signal rate 
(the classical conditioning schema 
would predict just the opposite be- 
cause high signal rate would consti- 
tute many massed extinction trials), 
the effects of intersignal interval, and 
the general failure of detection decre- 
ment to occur in complex tasks with 
multiple stimulus sources. With all 
of these weaknesses, it is doubtful 
whether the inhibition hypothesis 
deserves serious attention in any 
efforts directed toward developing a 
satisfactory theory. It is also doubt- 
ful whether Broadbent's attention 
hypothesis can be refined sufficiently 
to account for all of the known find- 
ings and to provide new deductions 
that can be given decisive experi- 
mental evaluation in the laboratory. 
Broadbent's loosely structured views 
generally have prevented them from 
being instruments to guide the experi- 
ments of laboratory workers in this 
area, and it is difficult to see them 
ever becoming useful unless a basic 
rephrasing of their tenets is under- 
taken to give the precision that a 
science asks of a theory.

The expectancy hypothesis cannot 
be said to have a precise expression 
but certainly it has been a good heuristic
device in stimulating a number
of experiments. Deese's initial expression
of expectancy (1955) was a broad
one and was based on the empirical 
relations between the response measure
of detection probability on the
one hand and independent task vari- 
ables of signal rate and intersignal 
interval on the other. The principal 
research on expectancy, however, has
been by Baker (1958, 1959a, 1959b, 
1959c) who has revised the Deese 
formulation of expectancy and in his 
research has mainly concentrated on 
the short range variations in per- 
formance as a function of intersignal 
interval rather than the overall detec- 
tion level based on signal rate. While 
Deese was unable to verify his hy- 
pothesis about expectancy and the 
intersignal interval, Baker found 
some support for his elaborated ver- 
sion of expectancy which he related 
to more variables than Deese. Baker's 
expectancy hypothesis is quasiquali- 
tative at this time but his approach 
is amenable to quantitative expres- 
sion. It is not complete, however, 
because it does not try to account for 
decrement as a function of observa- 
tion time, marked gains over rest, or 
the typical absence of decrements for 
multisource complex tasks.

The sensory variation or activa- 
tionist hypothesis is secured in pro- 
vocative physiological hypotheses 
about the role of ascending reticular 
activating system in maintaining 
responsiveness and, by appealing to a
proposed organismic requirement for 
stimulus variation if performance 
level is to remain high, most of the 
facts of vigilance research can be ex- 
plained in a general way. Behavior 
theories have always emphasized the 
guiding role of stimuli acquired 
through learning, and properly so, 
but these recent physiologically-based
hypotheses stress that stimuli have a 
maintaining function for the response 
too. Thus, the monotony of the 
vigilance situation is interpreted to 
be an absence of stimulus variety 
needed to maintain the response level, 
and variables such as high signal rate,
rest, knowledge of results, task com- 
plexity, etc., are taken to be opera- 
tions which promote stimulus varia- 
tion and high responding. This 
activation hypothesis is useful after- 
the-fact but it remains to be expressed 
carefully before-the-fact so that dif- 
ferential prediction can be made. 
These clues derived from the physio- 
logical level are suggestive but they 
are insufficient by themselves. The 
theorist still has problems at the 
molar level where type and amount 
of external stimulation must be re- 
lated to overt behavior. Ideally it is 
desirable to coordinate the molar con- 
cepts and functions with those at the 
physiological level, and the increased 
vigor of physiological research sug- 
gests that these echelons of organ- 
ismic action eventually will be inter- 
related. But with all of our present 
vigilance data at the molar level, it 
would be fruitful at this time to look 
for an expression of the stimulus 
variation hypothesis in terms of 
stimulus control of responses for the 
whole organism. The possibilities for 
dimensions of stimulation, both on 

As this review of recent research
will indicate, there is continuing in- 
terest among students of mental re- 
tardation in the relationship between 
sociometric (or peer choice) status 
and the level of ability of the person 
being chosen. Our aim is to review 
and evaluate representative investigations
of this relationship among
normal children, institutional retarded
children, and retarded children
attending regular or special
classes in public schools. But our
focus is on the institutional retardate.


The educational and institutional
importance of the relationship rests
on the notion that the interpersonal
environment is a powerful determinant
of development, and on the
further notion that the interpersonal
environment of the child is composed
predominantly of peer group
relations. A given interpersonal environment
may be assessed as facilitative
or restrictive of development.
As against school and institutional
standards of training, achievement,
and performance, it is important to
know the extent to which the peer
environment rewards or penalizes
members differentiated on abilities.

Like groups of adults, groups of
children exhibit structures of differentiated
preferences. Some members
will be preferred more by others.
Obviously, there are many correlates
of such structures, and these vary for 
any group according to the criterion 
used for expressing preference. A 
preference structure will usually have 
as correlates measures of homophyly, 
homogamy, propinquity, social con- 
formity, social initiative or domi- 
nance, as well as more esoteric as- 
pects of personal attractiveness. 

It is equally likely that the prefer- 
ence structure will reflect what is cul- 
turally valued within the group, as 
Riecken and Homans (1954) suggest. 
If academic achievement is valued 
by school children for example, the 
ablest peers (academically) will tend 
to be overchosen. As other values
may compete with achievement for
peer endorsement, the relation between
mental ability² and sociometric
status will always be limited. The 
more intelligent children may also 
prove more competent in their ex- 
pression or realization of competing 
values, however, which leads one to 
expect a persisting relationship be- 
tween mental ability and sociometric 
status. This does not suggest that the 
correlation between ability and socio- 
metric status should be high, but 
rather that it should be ubiquitous. 


2 Mental ability is used throughout this 
paper interchangeably with intelligence. All 
studies reviewed have used measures of
intellectual performance of the individual as 
against measures of “potential.” In speaking 
of intellectual or mental ability, we hoped to 
avoid the many remedial, diagnostic, and 
other clinical connotations which accrue to 
the concept of the intelligence quotient. We 
are also aware, as our discussion indicates, of 
the need for a better conceptualization of
mental ability.

In separate studies of five different
samples of children in grades from 
second through seventh, Bonney 
(1944) and Laughlin (1954) found 
positive correlations between in- 
telligence and sociometric status. In 
each case, the association was sig- 
nificant beyond the .01 level, but the 
coefficients (Pearson) were low, rang- 
ing from .31 (N=299) to .27 
(N=525).

Grossman and Wrighter (1948) re- 
lated intelligence as measured by the 
Stanford-Binet to sociometric status 
in a class of sixth grade students. 
They reported that high status peers 
were significantly higher in intelli- 
gence. They concluded that the two 
variables were significantly related, 
but that high intelligence did not 
assure high status. 

Barbe (1954), Bonney (1946), and 
Potashin (1946) gave further con- 
firmation to these findings, but they 
also found that mutual choices 
tended to be made between children 
with similar levels of mental ability. 
To this extent, the general associa- 
tion between ability and status is re- 
duced insofar as children in the modal 
range of mental ability reflect higher 
sociometric status: there is a greater 
chance of being chosen for subjects of 
modal ability, and the range of status 
scores is restricted. 

Gallagher and Crowder (1957), in a 
study of gifted children, found that 
four out of five students with Stan- 
ford-Binet IQs of 150 or above ob- 
tained above average sociometric 
status and that more than half scored 
in the top status quartile. Thus when
relatively extreme cases are considered,
mental ability improves considerably
as a predictor of sociometric
status.

Other representative or noteworthy
studies of the relation among
normal children are summarized in 
Table 1. When measures of specific 
types of performance such as reading 
comprehension, school achievement, 
motor skills, and social maturity 
have been correlated with sociometric 
status, coefficients of similar magni- 
tude have been obtained. These find- 
ings are reviewed in Gronlund (1959). 


RESEARCH ON INSTITUTIONAL 
RETARDATES 


Using the choice of “one best
friend” as a test criterion, Hays
(1951) investigated the problem with 
a sample of 127 defective, border- 
line, and dull-normal girls housed in a 
single institutional dormitory. His 
subjects ranged in age from 7 to 23 
years, with a mean of 14. Intelligence 
quotients and mental ages were de- 
rived from Stanford-Binet tests. The 
biserial correlation between dichoto- 
mized “choices received” and “no 
choices received” and IQ was .43
(p<.01).

Clampitt and Charles (1956) 
studied the relationship between 
sociometric status and supervisory 
evaluation of institutionalized men- 
tally deficient children. The 164 sub- 
jects, both girls and boys, ranged in 
age from 6 to 40 years. The mean age 
for girls was 19, for boys 14. At the 
time of the study, all subjects had 
been institutionalized for at least 
year, and the median term was 
years for girls and 3 years for boys. 
Most of the intelligence tests were 
Stanford-Binets. Sociometric choice 
and rejection responses were obtained
in relation to the following activities:
eating, playing, and working.
Significant, positive rank order correlations
were found between sociometric
status and MA, IQ, and
supervisory evaluation based on
selected traits. The correlation be-
TABLE 1

SUMMARY OF MAJOR STUDIES OF NORMAL CHILDREN

(Choice criteria are as obtained through sociometric test. All choices are within sample unless stated otherwise.)

Bonney (1944)a
    Choice criterion (or criteria): Take a trip with, vote for class librarian, vote for class officers, best friend, take picture with, number of valentines received, have as a partner for Easter party, expect to give Christmas presents, and best citizens and best leaders. No limit was made on the number of choices.
    Measures of mental ability: California Test of Mental Maturity (grade 2), Kuhlmann-Anderson (grades 3 & 4), Otis (grade 5), Pintner Intermediate (grade 6), Gates Primary Reading (grade 2), Stanford Achievement (grades 3, 4, 5, 6)
    N: [illegible]
    Correlation coefficient and statistical test: [illegible]

Bonney (1946)
    Choice criterion (or criteria): Same as Bonney (1944)
    Measures of mental ability: Same as Bonney (1944)
    N: 201
    Correlation coefficient and statistical test: .28-.44 Pearson

Gallagher & Crowder (1957)
    Choice criterion (or criteria): 5 best friends
    Measures of mental ability: Stanford-Binet, WISC, Stanford Achievement Test
    N: 30
    Correlation coefficient and statistical test: .45 contingency

Grossman & Wrighter (1948)
    Choice criterion (or criteria): First three choices for: sit near, walk home, play, class officer, and best friend
    Measures of mental ability: Stanford-Binet, Stanford Achievement Test
    N: 117
    Correlation coefficient and statistical test: “Usual low rectilinear relationship”

Laughlin (1954)
    Choice criterion (or criteria): One of your best friends, choose a group member but not a close friend, choose someone to be with once in a while, don't mind this individual in group but do not want to have anything to do with him, and wish individual were not in the group
    Measures of mental ability: Detroit Alpha Intelligence Test, Metropolitan Achievement Test, partial battery
    N: 525
    Correlation coefficient and statistical test: .21-.31 Pearson

Potashin (1946)b
    Choice criterion (or criteria): First three choices for: favorite activity, classroom project, best friend in class, and best friend out of class
    Measures of mental ability: Dominion junior and intermediate group tests
    N: 124
    Correlation coefficient and statistical test: [none shown]

Rosenthal (1956)c
    Choice criterion (or criteria): First three choices for: play, sit next to, invite to a party, and go to a show
    Measures of mental ability: Kuhlmann-Anderson, language measures
    N: 358
    Correlation coefficient and statistical test: [none shown]

a Barbe (1954) in his study of 244 normal children gives only percentages of intelligence levels of friends chosen by bright and slow learning children. No correlation test could be performed because the distribution by IQ level was not given.

b The differences between friends (mutual choices) and nonfriends were compared as to mental age and intelligence. The difference in the MA and IQ of friends and nonfriends is 1.2 and 2[illegible], respectively; the reliability of the difference in MA and IQ is .32 and .06, respectively.

c t tests compared high and low sociometric groups on 10 language measures. On six of the tests the groups differed significantly (p≤.05), and on four of the tests nonsignificantly (p≤.10).

TABLE 2

SUMMARY OF MAJOR STUDIES OF RETARDED CHILDREN

(Choice criteria are as obtained through sociometric test. All choices are within sample unless stated otherwise.)

Institutional retardates

Clampitt & Charles (1956)
    Choice criterion (or criteria): Three choices for: eating, playing, and working
    Measures of mental ability: Stanford-Binet
    N: 164
    Correlation coefficient and statistical test: .34 Spearman

Dentler & Mackler (1960)
    Choice criterion (or criteria): Three (or more) choices for: play, work, to be, and not want to play with
    Measures of mental ability: Porteus Maze
    N: 29
    Correlation coefficient and statistical test: .50 Pearson

Farber & Marden (1958)
    Choice criterion (or criteria): Three best friends
    Measures of mental ability: Stanford-Binet
    N: 77
    Correlation coefficient and statistical test: .40 Spearman

Hays (1951)
    Choice criterion (or criteria): One best friend
    Measures of mental ability: Stanford-Binet
    N: 127
    Correlation coefficient and statistical test: .43 Biserial

McDaniel (1960)
    Choice criterion (or criteria): One choice for: sit next to at lunch, sit with at movie, play, and work
    Measures of mental ability: WISC
    N: 15
    Correlation coefficient and statistical test: .35 Spearman

Sutherland, Butler, Gibson, & Graham (1954)
    Choice criterion (or criteria): Two choices for: working, eating, recreational periods, spare time activities, best friend, take with you on discharge, prefer to discuss plans and troubles, and associate with after marriage
    Measures of mental ability: Stanford-Binet
    N: 205
    Correlation coefficient and statistical test: .34 Pearson

Noninstitutional retardates

Johnson (1950)a
    Choice criterion (or criteria): Three choices for: like, sit next to, and play
    Measures of mental ability: Stanford-Binet (for retarded members, N=39), Vineland Social Maturity, New California Short-Form Test of Mental Maturity
    N: 688
    Correlation coefficient and statistical test: [none shown]

Turner (1958)b
    Choice criterion (or criteria): Three choices for: sit next to and play
    Measures of mental ability: Otis Quick Scoring Mental Ability Test, Vineland Social Maturity
    N: 390
    Correlation coefficient and statistical test: [none shown]

a t tests compared typical group with mentally handicapped group as to acceptance (t=4.10, p≤.01) and rejection (t=4.94, [p value illegible]).

b t tests compared high chosen group with low chosen group on ability (t=2.1, p<.05) and social maturity (t=2.78, p<.01).

tween IQ and choice status for boys,
for example, was .34 (p<.01).

Farber and Marden (1958) studied
the social organization of a boys' unit
at a state school for the mentally deficient.
The sample contained boys
ranging in age from 11 to 19. The 
mean age was 14.7. The length of 
residence ranged from 4 months to 
nearly 15 years. Only boys for whom 
Stanford-Binet IQ scores were avail- 
able were included. For the 77 sub- 
jects, status ranks were ascertained 
by interviews and questionnaires. A 
rank correlation of .40 (p=.01) was 
found between IQ and status.

Sutherland, Butler, Gibson, and
Graham (1954) made a sociometric 
study of one cottage of retarded fe- 
males. The 205 subjects ranged in 
age from 18 to 53 years. The pub- 
lished report did not indicate the 
length of institutionalization for the 
cottage members. The Stanford- 
Binet (Form M) was used as the 
ability measure. For their subjects, 
Sutherland et al. found a Pearson 
coefficient of correlation of .34
(p<.01) between intelligence and
choice status. To illustrate the rela- 
tionship, Sutherland et al. compared 
high and low status subjects. The 
mean IQ of the high status group was 
62.6 compared with a mean of 42.8 for 
the lows. Of the highs, 87% were 
above 70 IQ, while in the low status 
group 81% were below 70 IQ. 

These four studies on institutional
retarded children are corroborative.
They compare point for point with
studies of normal children and with
one another in finding a positive, significant
yet weak association between
intelligence and sociometric status.

McDaniel (1960) studied 15 retardates,
3 women and 12 men, who
ranged in age from 16 to 32 years,
with a mean age of 19. At the time of
the administration of the first sociometric
test, the group had been in
existence for 8 months. The study
did not indicate the length of insti- 
tutionalization, however. The mean 
IQ for the group, based on the Full 
Scale WISC test, was 52. The Spear- 
man rank correlation between IQ and 
sociometric status was .35 (p<.10). 
Although this coefficient fails to differ 
significantly from zero at the .05 
level, it is better to consider the co- 
efficients under review in terms of 
magnitude and expected range. For 
example, it is likely that McDaniel 
would have found a correlation sig- 
nificant at p equal to or less than .05 
had he assessed a group with 30 
rather than 15 subjects; yet the 
actual magnitude of the coefficient 
would probably have fallen between 
.30 and .50. 
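
The dependence of significance on sample size, with the coefficient held fixed, can be sketched numerically. The sketch below assumes the usual t approximation for testing a correlation against zero, t = r·sqrt(n − 2) / sqrt(1 − r²), applied to the Spearman coefficient, and a one-tailed test at the .05 level; neither assumption is stated in McDaniel's report, and the second sample of 30 is hypothetical. Critical values are copied from a standard t table.

```python
import math


def t_for_r(r: float, n: int) -> float:
    """t statistic for testing a correlation r from n pairs against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# One-tailed .05 critical values of t, keyed by degrees of freedom (standard t table).
CRITICAL_T_05_ONE_TAILED = {13: 1.771, 28: 1.701}

results = {}
for n in (15, 30):  # 15 = McDaniel's group; 30 = hypothetical larger group
    df = n - 2
    t = t_for_r(0.35, n)
    significant = t > CRITICAL_T_05_ONE_TAILED[df]
    results[n] = (t, significant)
    print(f"n={n}: t={t:.3f}, df={df}, exceeds one-tailed .05 critical value: {significant}")
```

Under these assumptions the same coefficient of .35 falls short of the critical value with 15 subjects and exceeds it with 30, which is why magnitude is the more informative basis for comparing studies.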

McDaniel's subjects exhibited very 
little interaction and a restricted 
range of choices. Considering the
“looseness” of this group's sociometric
structure, the relation between
mental ability and sociometric
status McDaniel found is all the more 
indicative of the weak but pervasive 
character of differentiation by mental 
ability. Even where group position 
appears to have limited salience for 
members, social preference depends 
somewhat upon the demonstration of 
skills of value in group activity. 

Dentler and Mackler (1960, 
1961) investigated mental abilities 
in relation to sociometric status 
among 29 newly arrived boys in a 
state school for retarded children. 
The boys ranged in age from 6 to 12; 
mean IQ on the Porteus Maze Test 
was 56. After the first month of resi- 
dence, sociometric and psychometric 
measures were taken. The association
between mental ability⁴ and
sociometric status was .50 (p<.01,
Pearson r). In the second month,
repeated sociometric assessment revealed
a correlation between choice
status and ability of -.14. Further
analysis indicated that, under increased
pressure from aides to restrict
peer interaction and to induce conformity
to cottage regulations, group
structure was reorganized (at least
temporarily) so that sociometric
status became increasingly associated
with conformity.

4 A full scale score was obtained from T
scale scores on the Porteus, the Parsons
Language Sample, and an index of social
maturity.


RESEARCH ON NONINSTITUTIONAL 
RETARDATES 


Johnson (1950) conducted a study 
in two communities in which there 
were no special classes for the men- 
tally retarded, thus assuring that all 
the educable, mentally retarded were 
in regular classrooms. Grades 1 
through 5 were sampled. He found 
that children diagnosed as mentally 
handicapped obtained significantly 
lower sociometric status scores than 
the nonhandicapped. In addition, 
Johnson found that  sociometric 
status was directly related to IQ and 
that rejection scores were inversely 
related to IQ. 

Turner (1958) studied the socio- 
metric status of mentally retarded 
children enrolled in special classes in 
Negro elementary schools in North 
Carolina. In all, 18 classes were sam- 
pled and 390 children tested. Using 
three measures to assess mental abil- 
ity (Table 2), Turner found that high 
ability children were chosen 15 or 
more times and the lows 3 times or 
less, using roughly the top fourth and 
the bottom fourth of the subjects 
ranked on mental ability. 


MEASURES OF SOCIOMETRIC STATUS 


The techniques and choice criteria 
used to measure sociometric status in 
the papers reviewed vary greatly. 
Barbe (1954) had teachers ask their 
pupils to nominate their three best 
friends, and gave equal weight to
each choice. Gallagher and Crowder 
(1957) asked for nomination of five 
best friends and ranked subjects by 
number of choices received. Bonney 
(1944, 1946) in marked contrast, 
used event-specific criteria. His sub- 
jects chose peers with whom they 
would most like to have their pic- 
tures taken, those they would pre- 
fer to work with on a committee 
for a social event, have as partners 
for a trip to a packing house, and so 
forth, across three additional cri- 
teria. Moreover, number of valen- 
tines received on Valentine's Day 
was tabulated and included as an in- 
dicator. Criteria were varied from 
class level to level. Scores were 
weighted, and number of choices was 
not limited. Resulting frequencies of
choices received were calculated as 
proportions of totals. 

Grossman and Wrighter (1948) 
used 10 choice criteria. They were 
thus able to assess internal reliability 
and found a mean Spearman-Brown 
reliability coefficient for four samples 
of .95. Validity was checked by 
examining the fit between socio- 
metric status and children elected as 
class officers. Scores were weighted 
and number of choices received on 
each criterion were summed. 
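
The Spearman-Brown coefficient mentioned above relates the reliability of a lengthened (or pooled) measure to the reliability of its parts: r_kk = k·r / (1 + (k − 1)·r). The sketch below is illustrative only. The source does not say whether the .95 figure came from a split-half correction (k = 2) or from pooling all 10 criteria, so the value of k used here is an assumption, and the function names are hypothetical.

```python
def spearman_brown(r_single: float, k: float) -> float:
    """Predicted reliability of a composite of k parallel parts, each with reliability r_single."""
    return k * r_single / (1.0 + (k - 1.0) * r_single)


def single_from_composite(r_composite: float, k: float) -> float:
    """Invert the Spearman-Brown formula: part reliability implied by a composite reliability."""
    return r_composite / (k - (k - 1.0) * r_composite)


# Assumption: the reported .95 describes a composite of 10 parallel criteria.
implied_single = single_from_composite(0.95, 10)
recovered = spearman_brown(implied_single, 10)

print(f"implied single-criterion reliability: {implied_single:.3f}")
print(f"composite reliability recovered from it: {recovered:.3f}")
```

If the coefficient was instead a split-half correction, the same functions apply with k = 2.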

In studies involving institutional 
retarded children, measures must be 
fitted to surmount illiteracy as well as
other handicaps. Clampitt and
Charles (1956) used three choice criteria:
eating associates, playmates,
and workmates. Choices were elicited
in interviews, and probes were used to
clarify communication and to insure
at least three choices on each criterion.
Number of choices was summed across
the three criteria, with rejections being
weighted negatively. Farber and
Marden (1958) interviewed their subjects
and elicited unlimited nominations
of best friends. If a subject
named only persons not included in
the sample, he was asked to name his 
best friends within the group. First, 
second, and third choices were 
weighted. Hays (1951) also used in- 
dividual interviews but asked only for 
choice of one best friend. Number of 
choices received ranged from zero to 
eight. Biserial correlation was neces- 
sary because of the extremely skewed 
distribution. McDaniel (1960) in- 
terviewed his institutional subjects 
but employed six criteria, including 
preferred associates to lunch with, sit 
with at the movies, play with, work 
with, help on a job, and persons nomi- 
nated as those with whom the sub- 
ject would not do any of these things. 
Only one choice was elicited on the 
first five criteria. Subjects were 
ranked by total number of choices 
received. Interestingly, no rejection 
nominations were made. McDaniel 
retested the group and obtained a 
response stability coefficient of .60 
(Spearman). Sutherland and associates
(1954) also interviewed their institutional
subjects, eliciting two
choices on each of eight criteria. 
Most of these were identical to those 
used by McDaniel but best friends 
were nominated as well as choices of 
associates preferred after institu- 
tional release. Of all the studies of in- 
stitutional subjects, only Suther- 
land's employed a probability model 
(Bronfenbrenner, 1945) to categorize 
subjects by status level.

Dentler and Mackler (1960) attempted
to simplify choice elicitation
by using photographs of all
group subjects randomly arrayed in
rows and columns on a large but
[illegible] beaverboard. Children were
interviewed under informal conditions
and asked to point to their
choices on four criteria: playmates,
workmates, most want to be, and
not want to play with. Number of
choices received per criterion was
normalized and resulting scores were 
combined. 

In their studies of sociometric 
status of noninstitutional retarded 
children, Johnson (1950) and Turner 
(1958) asked for friendship nomina- 
tions, playmates, and classroom 
seatmates. Both obtained their data 
through personal interviews but only 
Johnson secured rejection choices. 
Again, choices were not weighted, and 
summed number of choices served as 
the scale. 


DISCUSSION 


This review suggests that a mora- 
torium could be declared on studies of 
the relation between intelligence and 
sociometric status among children— 
a moratorium that ought to hold 
whether the children are gifted, nor- 
mal, or mentally retarded, and 
whether or not they are institutional- 
ized. The relationship has been dem- 
onstrated to hold, and to hold at a 
characteristic level, on samples rang- 
ing in size from 15 to more than 500, 
and across a wide age range. Its 
strength tends toward a constant 
value even where group relations 
have not become well established. 
Among retarded children in institu- 
tions, length of institutional residence 
appears to have little effect on the 
general relationship. The coefficients 
hold whether the intelligence test is
the Stanford-Binet, the WISC, the 
California test, or the Porteus Maze, 
and whether it is individually or 
group administered. Finally, the 
relation is roughly the same whether 
IQ or mental age is used. 

Most of the studies employed the 
best available instruments for assess- 
ing school related aspects of intelli- 
gence. In research involving re- 
tarded children, however, it is im- 
portant to exploit the fact that a large 
variety of abilities exist and that 
some of these are probably more 
closely related to sociometric status
than others. For example, Rosenthal 
(1956) found that the language of 
children of high sociometric status 
was more active and moving and 
more varied. 

It should be possible to differen- 
tiate relations between a variety of 
abilities and a variety of types of 
sociometric statuses. Thus function-
ing abilities such as performance sub-
scales on intelligence tests or motor
skills might be very highly associated 
with sociometric status on criteria 
involving leisure association or play- 
mate preferences. To avoid prema- 
ture closure of the question of how 
abilities relate to group status, at this 
stage it may be much more useful to 
consider discrete performance meas- 
ures and discrete status indicators. 

These considerations regarding the 
"true" range of group related abilities 
apply also to studies of normal chil- 
dren in classroom situations. As 
Gronlund (1959) indicates, research 
on measures of ability in relation to 
sociometric status among normals 
has accumulated steadily since 1940, 
yet little has been done to develop 
measures of the kinds of abilities that 
might be assumed, on the basis of 
hypotheses about group structure, to 
have peculiar relevance as determi- 
nants of status. Most projects have 
employed instruments developed orig- 
inally by educational psychologists 
and clinicians for very different 
purposes. 

Some factors underlying the low 
correlation between mental ability 
and status are methodological; others 
are substantive. Only one of the 
studies reviewed made use of a proba- 
bility model for classifying students 
as high, medium, or low in status, 
suggesting that much of the pre- 
sumed differentiation between sub- 
jects may be due to little more than 
chance or error variance. Similarly,
only one study employed methods of 
computation which took into ac-
count the ones making choices as well
as how many choices were received. 
Heavy reciprocation in the modal 
range could distort and depress esti- 
mates of true association. Only one 
study assessed in detail the interrela- 
tions between results on the several 
choice criteria. A few more under-
took evaluation of the reliability of
the responses, though unfortunately, 
three of the studies treated repeated 
measurements of sociometric status 
as tests of reliability rather than as 
indicators of change. Users of socio- 
metric measures should accept the 
probability of change over time. The 
task of specifying the elements of 
change in test data that may be at- 
tributed to the actual changes in the 
variable under study, as opposed to 
change that must be attributed to 
unwanted or chance fluctuation in
the test, is a task that has not been 
undertaken in the research under re- 
view. 

Despite the wide range of work at- 
testing the reliability and validity of 
diverse measures of sociometric
status (Mouton, Blake, & Fruchter,
1960a, 1960b), there is little doubt 
about the need for clarification of the 
concept of sociometric status. There 
is agreement that sociometric status 
gives an indication of the differential 
value peers tend to place on each 
other. There is also agreement that 
groups have standards against which 
such valuation is made. To this ex- 
tent, clarification by empirical means
should be possible through closer at-
tention to these norms or standards.
There is no good basis for demanding
that choice criteria should be highly
intercorrelated or even highly stable
over time, but there is theoretical
basis for investigating the fit between
criteria and group standards. For
example, none of the studies reviewed 
found length of residence in the in- 
stitution to be a qualifying variable, 
yet one of the concerns of the sociom- 
etrist should be the analysis of “in-
stitutional effects.” How do groups of
children within institutions develop 
structure, and how are these struc- 
tures influenced by the crucial fact of 
their location within an institu- 
tional culture? 

Length of residence may be de- 
terminative if approached longi- 
tudinally. Group structures are 
emergents; thus, sociometric status 
within an institutional cottage which 
has just formed may differ greatly 
from status in a cottage that has en- 
dured for years. The problem is one 
not of length of residence of individ- 
uals perhaps, but of duration of the 
group, as McDaniel (1960), Farber 
and Marden (1958), and Dentler and 
Mackler (1960) suggest. 

The study by Farber and Marden 
(1958) points a path that should be 
followed. Beyond finding association 
between intelligence and status, these 
investigators demonstrated associa- 
tions between sociometric status and 
formal classification of institutional 
boys as educables, trainables, and 
working-boys; status and popularity; 
and status and history of delin- 
quency. They identified the bases on 
which institutional retarded children 
in at least one state school classify 
themselves. For example, their sub-
jects distinguished between peers
[illegible] to rehabilitation and those
disposed toward a “custodial career.”
Sociometric status was shown to be
associated with such classifications.
[Illegible] status serves as a key to un-
derstanding the career paths pro-
vided by the institution and chosen
by patients. Though their study
does not treat changes in status, Far-
ber and Marden have developed
means for predicting the future be- 
havior of institutional retarded boys. 
Dentler and Mackler (1961), in 
studying changes in sociometric
status among newly arrived patients, 
found that as the culture of the insti- 
tution was absorbed, the relation be- 
tween mental ability and status came 
to depend increasingly upon stand- 
ards imposed on the group by aides. 
For at least a brief period, the usual 
positive relation between mental age 
and sociometric status was reversed. 
The mentally abler boys resisted the 
regulations most strongly and lost 
status as a result of deviance. 
Future sociometric research on the 
differentiation of members within 
children's groups should specify with 
greater precision the nature of the 
performance or ability under assess- 
ment and the particular variety of 
status. This effort should be linked 
with collection of data relevant to the 
development, situation, and norma- 
tive content of the group. Global or 
general indicators should be aban-
doned.


SUMMARY


A review of representative studies
of the relation between ability and 
sociometric status among normal 
children, institutional mentally re- 
tarded, and noninstitutional retarded 
children, indicated high agreement 
with the generalization that indi-
vidual ability is positively and sig-
nificantly associated with choice 
status. Studies of normal children 
have demonstrated that this relation 
holds whether the abilities assessed 
are measures of mental age, intelli-
gence quotients, or quite different
measures of achievement, or social or 
motor skills. Although significant, 
the association is uniformly limited to 
the .25 to .50 range. 
Sociometric status studies of in-
stitutional retarded children were 
viewed as particularly important, as 
they provide access to the study of in- 
stitutional effects and practical eval- 
uations of social rehabilitation. Socio- 
metric research on institutional chil- 
dren has been limited to correlation 
analysis of relations between socio- 
metric status and school-type intelli- 
gence tests, length of residence, and 
age, with the exception of but a few 
reports. 
The reviewers proposed that future
studies should sharpen the concept 
of mental ability or include dimen- 
sions that concepts of group structure 
suggest are of probable importance in
a given situation. Studies that attend 
exclusively to the relation between 
intelligence and status should be 
avoided, while efforts to predict 
status within groups undergoing 
formation or change should be in- 
creased. 


RESPONSE STYLE AS A PERSONALITY VARIABLE:
BY WHAT CRITERION?


RICHARD K. McGEE¹
Moccasin Bend Psychiatric Hospital, Chattanooga, Tennessee 


Without doubt, one of the most ac-
tive research areas in psychology dur-
ing the last decade has been the study 
of test taking response sets, or styles. 
In particular, attention has been de- 
voted to investigating their influence 
on personality inventory scores. This 
work has been confined almost ex- 
clusively to only three types of re- 
sponse tendency: the social desirabil- 
ity set, characterized by the con- 
sistent endorsement of desirable 
traits and the denial of undesirable 
ones; the deviation of a pattern of 
scores from the typical pattern pro- 
duced by a given population of re- 
sponders; and the acquiescence set, 
which consists of tendencies to choose 
the “true,” “agree,” or “like” option 
rather than their respective negative 
alternatives. Jackson and Messick 
(1958) have reviewed the research in 
this area, and have outlined their 
own suggestions for directing the 
course of future investigations. The 
purpose of the present review is to 
discuss one of the specific trends 
which has appeared in the area, sub- 
sequent to, and perhaps largely 
attributable to the Jackson and 
Messick (1958) article. 

The development of particular in- 
terest is in the utility of the response 
set component of test scores. Where- 
as Lentz (1938) and Cronbach (1946, 
1950) urged the control or elimina- 
tion of noncontent determined variance,


¹ The author is indebted to Douglas Jack-
son and Lee Sechrest for reviewing the original
manuscript and offering their valuable sugges-
tions for the final draft. Appreciation is also
expressed to H. J. Wahler for initially stim-
ulating the author's interest in response style
research.


Jackson and Messick (1958)
suggest that “for certain purposes in
personality assessment opportunities
for the expression of personal modes
for responding should be enhanced
and capitalized upon” (p. 244).
Thus, the recent trend referred to 
above is based on the thesis that a 
response style has its roots in the un- 
derlying personality complex of the 
responder. It is proposed that in- 
dividuals who vary in the extent to 
which they manifest a particular 
style of responding, will also vary in 
terms of certain measurable personal- 
ity traits. Various dimensions of 
personality have been suggested, and 
evidence collected to support this 
hypothesis. The most recent and 
most provocative article in this series 
(Couch & Keniston, 1960) concludes 
on the assertion that 

this integrated study ... has demonstrated 
both the far-reaching importance of response 
set in the area of psychological tests and the 
major proposition that the agreeing response 
tendency is based on a central personality 
syndrome (p. 173). 


The relationship between response 
styles and personality traits appears 
to be a most promising problem for 
investigations in the near future. 
However, it is clear, even at this early 
stage, that the already available 
literature offers important implica- 
tions for the design and execution of 
future studies. It is to this question 
that the present review is addressed. 


PERSONALITY CORRELATES OF THE 
SOCIAL DESIRABILITY RESPONSE
STYLE 

No effort will be made here to re- 
view the vast literature which has
accumulated on this topic since Ed-
wards (1953) reported a correlation 
of .87 between the scaled social de- 
sirability of item content and the fre- 
quency with which it is endorsed. In 
general, research with this response 
style has continued to be directed at 
showing its influence on a variety of 
psychological inventories (Bendig, 
1959; Cowen & Tongas, 1959; Taylor, 
1959, 1961), or its appearance in 
various clinical groups (Wahler, 
1958). Recently research designed to 
study the complexity of the social 
desirability response style has re- 
vealed its multidimensional char- 
acter (Messick, 1960a) and its inter- 
action with other response styles, 
particularly acquiescence (Jackson, 
1960; Jackson & Messick, in press; 
Messick, 1960b; Messick & Jackson, 
1961). 

The present literature contains 
only two suggestions that social 
desirability responding is related to 
basic personality traits. The first 
of these is barely more than a tenta- 
tive guess. Allison and Hunt (1959)
investigated the relationship between
social desirability responding and the
expression of aggression with various
degrees of frustration. Social desira-
bility tendency was measured by the
Edwards Social Desirability scale
(SD Scale) and correlated with ex-
pression-of-aggression scores from a
paper-and-pencil situational frustra-
tion test. In two experiments, they
found that high SD scale scores are
associated with a suppression of
aggression in ambiguous situations
where the culturally acceptable re-
sponse is unspecified. Their tentative
conclusion was that high social
desirability tendencies are found in
individuals who are other-directed,
whereas low social desirability tend-
encies characterize inner-directed
individuals.

Crowne and Marlowe (1960) crit-
icized the usual approach to the
social desirability response style.
They point out that one is never 
certain when to invoke the more 
parsimonious explanation that denial 
of undesirable traits is due to the 
genuine absence of the psychiat- 
ric symptoms usually embodied in 
the self-description inventories sup- 
posedly most affected by this tend- 
ency. Similarly, endorsement of 
desirable traits may reflect either 
defensiveness, or candid self-ap-
praisal, especially in college students.²
Hence, they offer a substitute method 
for measuring the social desirability 
tendency. The Marlowe-Crowne 
Social Desirability (M-C SD) scale 
is a 33-item inventory. Like the 
MMPI L scale, these items suggest 
behaviors which while socially desira- 
ble, cannot be endorsed by most peo- 
ple, if they are answering truthfully. 
It is the authors’ contention that 
social desirability responding rests on 
a basic need of the individual to be 
accepted and approved of socially. 
To test this notion, Marlowe and 
Crowne (1961) studied the relation- 
ship between SD scores and be- 
havioral tasks in the laboratory. 
They employed the Spool Packing 
task developed by Festinger and 
Carlsmith (1959). This is a boring, 
seemingly meaningless task designed 
to arouse negative, antagonistic feel- 
ings. They predicted that individ- 
uals with high SD scores have a 
strong need for social approval, and 


² This point is made by Crowne and Mar-
lowe with specific reference to responses made
by college students serving as subjects in
social desirability research projects. It is not
necessarily a criticism of the usual clinical in-
terpretations of social desirability scores, such
as Wahler's (1958) prognostic index for
psychotherapy candidates, or the various
MMPI scales designed to assess defensiveness.
However, two recent studies (Jackson &
Messick, in press; Messick & Jackson, 1961)
have demonstrated the response set influence
in the MMPI, and discussed important im-
plications for the validity of these scales as
they are currently used clinically.


would thus hold favorable attitudes
toward the experimental situation 
following the Spool Packing task. 
More negative attitudes were pre- 
dicted for the low SD scorers. SD 
scale scores were dichotomized at the 
mean, and these two groups were 
shown to differ significantly (p < .01)
on the attitudes they expressed to- 
ward the task. The difference was in 
the predicted direction, with the high 
SD group expressing more favorable 
attitudes than the low SD group. 
They also observed a correlation of
−.54 between the M-C SD scale and
the Barron Independence of Judg- 
ment scale (1953) which is designed 
to measure social conformity. 
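
The analysis just described, splitting subjects at the mean SD score and comparing the two groups on expressed attitudes, can be sketched as follows. The source does not name the significance test used, so Welch's t statistic is an assumption here, and the data and names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def split_at_mean(sd_scores, attitudes):
    """Split attitude ratings into high- and low-SD groups at the SD-scale mean."""
    cut = mean(sd_scores)
    high = [a for s, a in zip(sd_scores, attitudes) if s > cut]
    low = [a for s, a in zip(sd_scores, attitudes) if s <= cut]
    return high, low

def welch_t(x, y):
    """Welch's t statistic for two independent groups (unequal variances allowed)."""
    return (mean(x) - mean(y)) / sqrt(variance(x) / len(x) + variance(y) / len(y))

# Hypothetical SD scale scores and attitude ratings for eight subjects.
sd_scores = [10, 12, 14, 15, 20, 22, 24, 27]
attitudes = [2, 3, 2, 3, 5, 6, 5, 6]
high, low = split_at_mean(sd_scores, attitudes)
t = welch_t(high, low)
```

A positive t in this sketch corresponds to the predicted direction, with the high SD group expressing the more favorable attitudes.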

To follow up this latter finding, 
Strickland (1960) administered the 
SD scale to subjects and also observed 
their behavior in an actual conformity 
situation similar to the original Asch 
(1956) procedure. When the SD 
scores were again dichotomized at 
the mean, the groups differed signif- 
icantly (p<.005) on the basis of 
their yielding scores, with yielders 
having the higher need for social ap- 
proval. 

The preceding data are not pre- 
sented here for the purpose of show- 
ing the construct validity of the M-C 
SD scale. These studies are impor- 
tant in that they represent a major 
attempt to relate social desirability 
responding to underlying personality 
variables. It is even more important 
that these studies have employed a 
procedure which is rare in the area of 
response style research. Seldom does 
one find studies wherein the stylistic 
variable is correlated with methodo- 
logically independent observations. 
The typical procedure has been to 
employ as criteria, other psycho- 
metric instruments containing a pos-
sibly strong methodological contami-
nation. This attempt to seek an inde- 
pendent criterion is a highly valued 
step in psychological research, and is
a point on which this review will 
focus later. 


PERSONALITY CORRELATES OF 
DEVIANT RESPONSE PATTERNS 


Several studies have appeared in 
recent years dealing with general as- 
pects of personality which relate to 
deviant responding in a variety of 
stimulus situations. This research 
has usually been presented as an at- 
tempt to validate the Deviation 
Hypothesis. Berg (1955, 1957, 1958, 
1959) has formulated the Deviation 
Hypothesis notion as an extension of 
the concept of “set” or Einstellung. 
He suggests (1959) that it serves as 
a unifying principle to account for 
the results of many disparate studies 
seeking to predict behavior under 
widely varying conditions. In sim- 
plest form the Deviation Hypothesis 
asserts that deviant behavior is gen- 
eral; that deviant responses occurring 
in one “uncritical” area of behavior 
predict the occurrence of deviant re-
sponses in other “critical” areas. In
a recent paper Sechrest and Jackson 
(1960) pointed out the broad gen- 
erality of Berg's notion. 

It has been suggested that psychotics, lawyers,
cardiac patients, transvestites, young normal 
children, character disorders, the obese, the 
feeble minded, psychoneurotics, and persons 
suffering from constipation, among others, 
represent deviant groups which might be ex- 
pected to manifest their particular propensi- 
ties toward deviation not only in a modality 
relevant to their particular symptoms and to 
items with relevant content, but also in re- 
sponse to one or more of the following: prefer- 
ence for abstract drawings, food aversion 
questionnaires, stimuli for conditioned re-
sponses, autokinetic and spiral aftereffect sit-
uations, vocabulary test items, figure draw-
ings, musical sounds, and olfactory stimuli
(p. 2). 

Evidence to support this position 
has been presented by Grigg and
Thorpe (1960). They administered
the Gough 300-item adjective check-
list to a college freshman class and
computed the frequency with which 
each item was checked by students as 
being self-descriptive. Those adjec-
tives endorsed by more than 86% and
those checked by fewer than 14% of
the students were selected and formed
a final 72-item list. This adjective 
checklist was then presented to the 
next incoming freshman class, and 
their self-descriptions obtained. De- 
viation scores were computed for 
each student by counting the number 
of commonly checked adjectives 
which he omitted, plus the number 
of rarely endorsed items he checked. 
At the end of the academic year 
students in this sample who appeared 
at the counseling center for voca- 
tional guidance, or for personal coun- 
seling, or who sought private psychi- 
atric care in the community were 
identified. A control group was 
randomly selected from among the 
freshmen who did not fall in any of 
these three categories. When the 
deviation scores were compared for 
these four groups it was found that 
those seeking either private psychi- 
atric care or personal counseling for 
emotional problems had significantly
higher scores (p < .01) than those who
sought only vocational guidance, or
no help at all. The two higher
groups were not significantly differ-
ent from one another, nor were the
two lower groups.
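
The item selection and deviation scoring attributed to Grigg and Thorpe can be written out as a short sketch. The 86% and 14% cutoffs and the scoring rule come from the description above; the adjectives, endorsement rates, and function names are hypothetical and are not taken from the Gough checklist.

```python
def select_items(endorsement_rates, high=0.86, low=0.14):
    """Pick items endorsed by more than `high` or fewer than `low` of the norm group."""
    common = {item for item, rate in endorsement_rates.items() if rate > high}
    rare = {item for item, rate in endorsement_rates.items() if rate < low}
    return common, rare

def deviation_score(checked, common, rare):
    """Count common items left unchecked plus rare items checked."""
    checked = set(checked)
    return len(common - checked) + len(rare & checked)

# Hypothetical endorsement rates from a norming class.
rates = {"friendly": 0.93, "honest": 0.90, "cruel": 0.04, "bitter": 0.10, "quiet": 0.50}
common, rare = select_items(rates)

# One hypothetical student's self-description.
score = deviation_score({"friendly", "bitter", "quiet"}, common, rare)
```

In this example the student omits one commonly checked adjective and checks one rarely endorsed adjective; items with middling endorsement rates do not enter the score.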
This study is the most recent in-
vestigation of the type initiated by
Berg and Collier (1953). They
[illegible] that the tendency to
[illegible] extreme responses was a deviant
[illegible] which would differentiate
high anxiety males from low anxiety
males. They used the ambiguous pic-
tures of the Perceptual Reaction Test
(PRT) (Berg, Hunt, & Barnes, 1949)
as a response set measure. Barnes
(1955) demonstrated that psychiatric
patients with various diagnostic labels
could be differentiated from one
another, and from normal control 
subjects on the basis of their pattern 
of responses to the PRT. 

It should be evident that the 
studies reviewed here are dissimilar 
in one major respect to others in this 
area; namely, there is an absence of 
any attempt to specify a particular 
trait or personality variable which is 
related to deviant responding, and 
therefore considered basic to it. 
While Berg has clearly shown that 
different personalities differ in the 
response pattern they produce, he 
has apparently not considered it 
necessary to hypothesize an under- 
lying personality trait, and show how 
it relates to deviant responding. 
However, Berg (1959) has acknowl- 
edged several unresolved questions in 
his unfinished work. He suggests 
that the forthcoming research may be 
expected to provide important data 
concerning the environmental, per- 
sonality, and biochemical variables 
which may be related to atypical re- 
sponse patterns. 

Sechrest and Jackson (1960) con- 
sider the deviation of response pat- 
terns from group to group to be an 
extremely important research area 
with far-reaching implications for 
personality assessment. Yet, having 
learned from the "school of hard 
knocks and tough breaks” that psy- 
chological processes are not always 
simple unidimensional variables, they 
voice a healthy skepticism that the 
Hypothesis is really as general as 
Berg would apparently have his 
audience believe. Fortunately, Berg 
(1959) identified, as one of the major 
sources of difficulty in his work, the 
lack of operationally clean criteria for 
the identification of deviant and 
criterion groups. He has stressed the 
importance of selecting these groups 
on the basis of valid behavioral char- 
acteristics. It is to be expected that 
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future studies will attempt to show 
via operational criteria the extent to 
which the Deviation Hypothesis is 
applicable to measuring specific per- 
sonality traits, and the conditions 
under which its generality is limited. 


PERSONALITY CORRELATES OF 
RESPONSE ACQUIESCENCE 


The tendency to respond “yes,”
“agree,” or “true” to personality in-
ventory items irrespective of their 
content has been the subject of many 
studies in recent years. In reviewing 
this area, Jackson and Messick (1958) 
concluded: 
In the light of accumulating evidence it seems 
likely that the major common factors in per- 
sonality inventories of the true-false or agree- 
disagree type, such as the MMPI and the 
California Psychological Inventory, are in- 
terpretable primarily in terms of style rather 
than specific item content (p. 247). 


In line with the extensive interest in 
response acquiescence per se, there 
have been numerous suggestions that 
herein lies a new device for observing 
systematic behavior which will lead 
to valid inferences about the nature 
of a particular “black box.”


Authoritarians and Conformers 


The acquiescence set first attracted 
major attention in connection with 
its influence on the California F 
Scale (Adorno, Frenkel-Brunswik, 
Levinson, & Sanford, 1950). Thus it 
is quite natural that acquiescence has 
been closely linked with the trait of 
authoritarianism. When it was no 
longer reinforcing to crucify the F 
Scale because of its susceptibility to 
response style influence (Bass, 1955, 
1957; Chapman & Campbell, 1957; 
Cohn, 1953; Jackson & Messick, 
1957; Jackson, Messick, & Solley, 
1957; Messick & Jackson, 1957) at- 
tention turned to the question of a 
psychological (as well as mechani- 
cal) relationship between response 
acquiescence and authoritarianism.
Leavitt, Hax, and Roche (1955) sug- 
gested that the confounding of acqui- 
escence and authoritarianism in the 
F Scale was a lucky accident which 
increased the discriminating power of 
the instrument. This conclusion was 
based on their view that the tendency 
to agree with things said in an au- 
thoritative manner is itself a factor of 
the authoritarian personality. Gage 
and his associates (Gage & Chatter- 
jee, 1960; Gage, Leavitt, & Stone, 
1957) have argued that negative 
items have more validity for measur- 
ing authoritarianism than do positive 
items. Their reasoning rests on cer- 
tain assumptions about response ac- 
quiescence. They point out that dis- 
agreeing requires more self-confi- 
dence, ego strength, and personal 
security than does the act of agreeing. 
Hence, acquiescence is one of a family 
of traits including authoritarianism, 
conformity, or obeisance to author- 
ity. The person who responds in an 
acquiescent manner essentially yields 
to the “authority” of the printed 
word, or the physical stimulus how- 
ever presented. 

In a series of research projects 
Jackson (1955, 1958, 1959) has ac-
cumulated data which tend to con-
firm those “logical” arguments made
by Gage cited above. In his earlier
studies Jackson (1955, 1958) formu-
lated a theory of cognitive energy.
This hypothetical construct is in-
ferred from a person's ability to resist
field forces presented by stimuli in
his environment. The number of
perceptual shifts made by reversible
figures under instructions to hold one
phase represents an operational meas-
ure of a subject's resistance to hypo-
thetical forces in the perceptual field.
Jackson demonstrated that this meas-
ure is positively correlated with social
conformity. Individuals who are able
to “hold” the Necker cube in the
position instructed also have high
independence scores on the Inde- 
pendence of Judgment scale; subjects 
who have less resistance to the tend- 
ency to perceive changes in the posi- 
tion of the cube are also yielders, or 
conformers. Recently, Jackson (1959) 
has shown that resistance to these 
field forces is associated with acquies- 
cence response tendency as measured 
by F Scale scores. High acquiescers 
are low in cognitive energy whereas 
nonacquiescers are high in the energy 
required to resist the reversing of the 
figures. Thus, Jackson has presented 
empirical evidence to show that ac- 
quiescers are both conformers, and 
possessors of limited personality
strength or energy. In so doing, he
confirmed two of the predictions
made by Gage and his associates, 
referred to above. Certainly his data 
are more convincing than those which 
led Bass (1956) to the tentative con- 
clusion that the person with a high
social acquiescence score is “an out-
ward-oriented, insensitive, non-in-
tellectual, socially uncritical indi-
vidual; in short, a Babbitt—an un-
questioning conformer to social de-
mands placed upon him” (p. 297).


Noncritical Thinkers 


While Jackson has shown that ac-
quiescence relates to a general proc-
ess of cognitive functioning, specific
reference has been made to the rela-
tionship between response style
and critical or analytical thinking.
Frederiksen and Messick (1958) em-
ployed Helmstadter’s (1957) method
of separating the content component
from the response set component of
test scores. They observed the vari-
ance [illegible] response set in relation-
ship to [illegible] of the personality scales
of the Personality Research Inven-
tory (Saunders, 1955). In general
they found low correlations, but con-
cluded that their data suggested the
possibility of using response sets to
measure some personality variables. 
Of particular interest in their report 
is the attention given to the trait of 
"criticalness," defined in terms of 
tasks employed in the study. It was 
shown that a set to be critical could 
be effectively induced for some tasks, 
but also that a significant (albeit low) 
negative correlation existed between 
criticalness and acquiescence meas- 
ured by the F Scale. Hence, this 
would tend to corroborate Bass’ 
(1956) assertion regarding the un- 
critical acceptance of situations by 
acquiescers. It might also be taken as 
confirmation of Jackson's notion re- 
lating cognitive energy level to non- 
acquiescence, assuming that critical 
or analytical thinking requires the 
exertion of a relatively high level of 
effort. 

Additional data showing the inter- 
action between acquiescence and cer- 
tain cognitive variables have been 
published by Messick and Frederik- 
sen (1959). They showed negative 
relationships between acquiescence to 
the F Scale (both original and re- 
versed forms) and verbal knowledge, 
general reasoning, and deductive 
thinking. Previously Hardy (1956) 
had shown that certain scales of the 
CPI significantly predict academic 
achievement in a midwestern college 
population. Jackson (1960) observed 
that a feature which these scales had 
in common was a large number of 
items keyed “false,” particularly 
items for which a "true" response 
would have been undesirable. Com- 
bining these separate observations, 
Jackson and Pacine (1960) reasoned 
that an acquiescence style, moderated 
by item desirability, should have a 
relationship to academic achieve- 
ment. They examined this hypothe- 
sis and found that a criticalness style 
did predict grade-point averages to a
low but significant degree. Acquiescence
scores on modified F Scale
forms did not predict academic
achievement, but showed significant 
negative correlations with verbal 
knowledge, and consistent (not al- 
ways significant) negative relation- 
ships with general reasoning. 

There appear to be rapidly expanding
pools of data which indicate that
there are stable and meaningful rela- 
tionships between the response deter- 
minants stimulated by true-false or 
agree-disagree item forms and meas- 
ures of important cognitive variables. 


Yeasayers and Naysayers 


Perhaps the most ambitious re- 
search on the personality correlates of 
the acquiescence style has been presented
by Couch and Keniston
(1960). They combined 681 items
from several personality inventories.
A factor analysis of responses to these 
items yielded a 360-item agreement 
factor which the authors labeled the 
Overall Agreement Scale (OAS). 
With additional measures, they found 
positive correlations between OAS 
and scales with a high proportion of 
responses keyed true. (Where the 
greater proportion of items was keyed 
false, the correlations with OAS were 
negative.) Traitwise, high OAS was 
associated with measures of impul- 
sivity, dependency, anxiety, mania, 
anal resentment, and anal preoccupa- 
tion; low OAS was associated with 
ego strength, stability, responsibility, 
tolerance, and impulse control. 

In addition to correlating OAS 
with these paper-and-pencil measures 
the authors made a searching clinical 
evaluation of their extreme respond- 
ers. Subjects were selected from each 
tail of the distribution of OAS scores; 
high scorers were identified as yeasayers,
and low scorers were labeled
naysayers. Each subject then filled 
out a 55-item incomplete sentence 
form and participated in a depth interview
lasting from 2 to 4 hours during
which time the experimenter
focused on each of these 55 projective 
responses. Following the interview, 
the experimenter rated each response 
on five separate scales, indicating the 
extent to which the response was 
typical of the theoretical yeasayer, or 
the theoretical naysayer. Interviews 
were “blind” with respect to knowl- 
edge of the subject's OAS score, and 
subjects were randomly divided 
among interviewers. Results revealed 
clear differences between the ratings 
made for yeasayers and naysayers. 
The authors describe these differences 
using the typical abstract clinical 
language: yeasayers are impulsive, 
emotionally reactive, extraverted, externally
oriented, low in psychological
inertia, and possess passive egos;
naysayers are guarded, defensive, 
constricted, inhibited, introverted, 
withdrawing, introspective, high in 
psychological inertia, slow and critical 
reactors, and possess active egos. In 
summarizing their report, the authors
consider the dimension of Stimulus
Acceptance versus Stimulus Rejection
to be the best single construct
subsuming all the other specific traits 
related to agreeing response style. 
The similarity of this position to that 
of Bass (1956), Frederiksen and 
Messick (1958), and Jackson (1959) 
is quite significant. 

Webster (1960) utilized the Couch 
and Keniston (1960) stimulus ac- 
ceptance-rejection concept with his 
own speculation that response set 
(RS) variance is related to an inhibition
versus lack of inhibition dimension.
He concluded that another "all
pervasive syndrome" is being isolated
for the understanding of personality.
This conclusion was based
on Webster's data which showed that
RS had high negative correlations
with measures of Schizoid Functioning
and Impulse Expression. His RS
scores are determined by the frequency
with which the subjects respond
"no" so as to deny undesirable
traits implying psychopathology. In 
line with the Crowne and Marlowe 
(1960) argument cited earlier, one 
would naturally predict these find- 
ings. But Webster carries his inter- 
pretation further: 

Finally . . . it becomes clearer why RS is a
measure of inhibition; both these correlating 
scales measure lack of inhibition or control. 
In particular, Schizoid Functioning measures 
a kind of ego-diffusion which is very typical of 
the undercontrolled college student (p. 5). 


In summary, there appears to be a 
general agreement in the literature 
that there is a trait of response ac- 
quiescence, and that it is probably 
closely related to some personality 
variable. There is also high agree- 
ment as to what to call this variable, 
or what kind of dimension to put it 
on: acquiescers are stimulus accept- 
ing, uninhibited, conformers; non- 
acquiescers are stimulus rejecting, 
inhibited, independents. 


THE PRESENT STATE OF AFFAIRS 


Throughout their provocative dis- 
cussion of acquiescence and person- 
ality variables, Couch and Keniston 
moved progressively further up the 
abstraction ladder. Beginning with 
individual item responses they moved 
via factor analysis, projective testing,
and depth interviewing to the level of
"ego functioning and psychological
inertia." Their progression was not
unique, for it paralleled the route
taken by Bass from the endorsement
of proverbs via correlational procedures
to "Babbittism!" This comment
is not intended to take issue
with the language used by former investigators,
nor to debunk the design
of their research. Such language,
though abstract, nevertheless communicates
ideas and feelings to a large
audience of psychologists, particularly
clinicians. Likewise such research
is an integral part of the inductive
process of theory building,
which is a valuable means to an end. 

However, left in this present state, 
the task is unfinished. To assume 
that the personality correlates of response
acquiescence have been identified
is to make the present collection
of inductive research findings an end
in itself. The task remaining should
be obvious, i.e., the deductive formu- 
lation and testing of hypotheses to 
predict the behavior which the theory 
indicates should be related to the 
stylistic variables. 

This review of the literature has 
been organized around the author's 
impression that what has gone on in 
the past has resulted primarily in de- 
scriptive information. The question 
proposed, and answered with progres- 
sively more rigor and precision, ap- 
pears to have been: “What is the ac- 
quiescent person like?" There may 
be those who would argue that this is 
an inappropriate question to ask. It 
is only a minor variation of the 
"What is . . . ?" type of question
which Muenzinger (1957) labeled as 
a "sterile exercise" in psychological 
research. How this argument is to be 
resolved will be left for the reader to 
decide for himself. The point to be 
made here is based on the assumption 
that the previously gathered descrip- 
tive data do represent a valuable con- 
tribution to the area of personality 
assessment. But, it is now time to 
shift into low gear and change course 
so as to proceed down the abstraction 
ladder in the direction of observable 
behavior. A question of major im- 
portance for future research is one 
stated in predictive form: “What will 
the acquiescent person do?"

At this point a short definition of 
terms is necessary to establish the 
proper Einstellung for the point to be 
made in the following section. The 
problem of definition revolves around
the usage of the term "observable be- 
havior." It is essential to distinguish 
behavior in situations especially de- 
vised to observe a subject's responses 
in the laboratory, or in field settings, 
from patterns of responses made to 
paper-and-pencil questionnaires or 
inventories. There is no argument, 
certainly, that the term observable 
behavior, broadly conceived, includes 
both activities. Perhaps a useful dis- 
tinction to some would be in terms of 
"psychometric versus nonpsycho- 
metric" situations. Yet, again, 
broadly defined, any measurement of 
behavior is a psychometric situation. 
There is a meaningful distinction be- 
tween the behavior involved in re- 
porting the distance traveled by a 
stationary light in an autokinetic 
conformity task, and the behavior 
involved in marking the agree cate- 
gory on an IBM answer sheet. Cer- 
tainly there are few who would not 
grant this distinction, or who would 
not further grant that the distinction 
is based on the methodological inde- 
pendence of the two situations. In 
the following paragraphs the term 
observable behavior is used to denote 
responses elicited in a laboratory task 
as distinct from those elicited by the 
standard psychometric instrument. 


SUGGESTED CONSIDERATIONS FOR 
FUTURE RESEARCH 

Measures of Personality Variables 

The first point to be made in this 
regard has already been alluded to. 
There is a remarkable dearth of studies
in this area which have attempted to 
study the relationship between re- 
sponse style measures and observable 
behavior measures of personality variables.
The investigations of Crowne
and Marlowe and their students are a
notable exception. In the area of acquiescence
only Jackson has used an
independent behavioral measure of
the trait he was studying along with
response style. The reasons for the 
needed shift to behavioral tasks are 
clear. Of primary importance is the 
fact, well recognized by most, that 
paper-and-pencil personality scales 
are heavily loaded with RS influence. 
To the extent that this is true, it 
naturally leads to inflated correla- 
tions between an RS measure, A, and 
a personality scale, B. If item con- 
tent is considered of minor impor- 
tance in determining a scale score, as 
RS research generally assumes, then 
correlating A with B to show the re- 
lationship of two response style meas- 
ures is a reasonable procedure. But 
to correlate A with B and conclude a 
relationship between response style 
and the trait purportedly measured 
by the content of B is a logically un- 
justifiable procedure. The paper by 
Webster (1960) which was discussed 
above is a good example of this re- 
markably inadequate approach to 
hypothesis testing. Personality in- 
ventories and RS measures are, by 
nature of their similar construction, 
to say nothing of item overlap, highly 
contaminated methodologically. Be- 
cause of the generalized operation of 
RS variables, one cannot serve as an 
independent criterion for the other. 
When Frederiksen and Messick 
(1958) corrected their personality 
scale scores for RS, they found quite 
low relationships. Independently ob- 
served behavior is the only meaning- 
ful criterion measure of the personal- 
ity variables. 
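
The contamination argument above can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not a reanalysis of any study cited here: it assumes a response-style component that enters both the RS measure A and the personality scale B, a content trait that is statistically independent of the style, and unit-variance normal noise. The names `simulate`, `pearson`, `style`, and `trait` are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=5000, seed=0):
    """Return (r between A and B, r between A and the independent trait).

    Assumption: scale B = trait content + shared response style + noise,
    and RS measure A = response style + noise. Trait and style are
    generated independently, so any A-B correlation is method variance.
    """
    rng = random.Random(seed)
    style = [rng.gauss(0, 1) for _ in range(n)]   # response style
    trait = [rng.gauss(0, 1) for _ in range(n)]   # independent criterion
    scale_a = [s + rng.gauss(0, 1) for s in style]
    scale_b = [t + s + rng.gauss(0, 1) for t, s in zip(trait, style)]
    return pearson(scale_a, scale_b), pearson(scale_a, trait)


if __name__ == "__main__":
    r_ab, r_a_trait = simulate()
    print(f"r(A, B)     = {r_ab:.2f}")
    print(f"r(A, trait) = {r_a_trait:.2f}")
```

Under these assumptions A correlates substantially with B even though it is unrelated to the trait itself, which is why only an independently measured criterion can separate the two interpretations.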

Finally, inasmuch as the goal of
psychological research is generally accepted
to be the prediction of behavior,
it is inadequate to stop short
of that point. Admittedly laboratory
conditions which provide the necessary
controls of relevant variables
also produce an artificiality which
makes the situation unlike the subject's
real world. Yet, it is a step in
the direction of the ultimate criterion,
and one which justifies the added ex- 
pense and effort. 


Measures of RS Variables 


In the area of acquiescent respond- 
ing, several instruments and tech- 
niques have been proposed. They in- 
clude various kinds of content from 
aphorisms to statements of personal 
and social attitudes. Some instru- 
ments attempt to measure pure ac- 
quiescence by putting an individual 
in a situation where he is forced to 
respond to ambiguous, or essential- 
ly meaningless stimuli. Cronbach 
(1950) suggested that response tend- 
encies should be most apparent in 
situations where stimulus conditions 
are most uncertain. Berg and Rapa- 
port (1954) confirmed this expecta- 
tion by showing that consistent pref- 
erences for certain response options 
result when the subjects respond to 
an "unstructured questionnaire"
wherein they guess about the non- 
existent items the experimenter is 
supposedly reading to them. Bass 
(1956) also used this unstructured 
technique and compared it with his 
content-laden Social Acquiescence 
Scale. He found a correlation of .00. 

If acquiescent behavior is to be interpreted
as the indiscriminate use of
yes, true, and agree options, irrespective
of item content, correlations
should be high between content and 
noncontent measures of acquiescence. 
Or, if there are two types of acquies- 
cence as Bass has suggested, one 
would at least expect that various 
noncontent measures would constitute
one of the types, and correlate
highly with one another. McGee
(1962) has found correlations which
suggest that this is not true either.
Although this discussion has focused
upon the measurement of acquiescence,
the point applies equally well
to any response style one wishes to
relate to central personality traits.
Consequently, the course of future 
research on the personality correlates 
of response styles is clearly indicated. 
The basic assumption is still in need 
of verification. But first, two things 
must be demonstrated: that tech- 
niques are actually available for 
measuring a "pure" response style 
tendency, and that this response style 
variable can be used to predict be- 
havior in an independent situation on 
the basis of some theoretical inter- 
pretation of that variable. Until 
these two points are demonstrated 
there is no defensible evidence that 
test taking response style is related to 
underlying personality syndromes, or 
traits. 


SUMMARY 


The recent surge of interest in response
style components of personality
test scores has led to a more
specific interest in measures of re- 
sponse variables as predictors of un- 
derlying personality traits of the re- 
sponders. The research studies rele- 
vant to this question, most of which 
appeared in the literature since 1958, 
have been reviewed. The point was 
made that these investigations have 
provided meaningful abstract de- 
scriptions of the personalities of indi- 
viduals with certain response style 
tendencies, but little real defensible 
data to tie response styles to the 
criterion of independently measured 
behavior. Suggestions were made for 
designing future research efforts such 
that the data will lead to a prediction 
that the acquiescent individual will 
do something in a particular situation
as well as merely say "yes" more 
times than he says "no" on the F 
Scale. Only with such data is it felt 
that an adequate criterion will exist 
for claiming a relationship between
response tendencies and basic personality
traits.

Some studies involve only two
groups and provide only one differ- 
ence to be tested for significance. 
Other studies involve several groups 
and provide many differences to be 
tested for significance. A question 
has arisen in the literature (Duncan, 
1955; Ryan, 1959; Tukey, 1949) as to 
how significance should be deter- 
mined when a number of tests are to 
be made in the same experiment. 
Ryan (1959) has performed a valua- 
ble function by pointing out that 
there are several ways of dealing with 
this problem. 

It would be possible to adopt a strat- 
egy that would hold errors constant 
per comparison, per hypothesis, per 
experiment, per group, or even per 
subject. The question is essentially: 
what is the appropriate unit in which 
to evaluate research? It is the thesis 
of this paper that the most defensible 
decision is to divide our work into 
separate tests of hypotheses and to 
hold constant the expected number of 
errors per hypothesis tested. 

The number of groups involved in 
the test of a single hypothesis may 
vary depending on the attitude of the 
experimenter and the nature of the 
hypothesis. Often an experiment 
determines the effects of several 
degrees of a measurable variable: in 
this case the hypothesis is usually 
that there is some relationship be- 
tween an independent and dependent 
variable. In this case differences 
between individual groups may be of 
little concern. For example, if
length of food deprivation is varied at
2-hour intervals from 2 to 24 hours,
it is the overall variability between
groups that is of interest, not the 
difference between any particular 
pair of groups. A failure to find a 
difference between the 8-hour and 10- 
hour group would be of little impor- 
tance. In other cases several groups 
may be run that do not represent 
points on a measurable dimension 
and in such cases the difference be- 
tween each group and every other 
group may be viewed as a separate 
hypothesis. For example, if the re- 
sults of five different therapies are 
compared, the significance or non- 
significance of the difference between 
any two groups would probably be 
considered important. In this second 
case there would be more hypoth- 
eses but less data relevant to each 
one. If several variables are studied 
in a single experiment the significance 
of the effect of each variable and each 
interaction may be tested as a 
separate hypothesis. The practice of 
holding errors constant per hypoth- 
esis tested seems to be by far the 
most common in the literature: the F 
test is typically employed when the 
performance of several groups is sub- 
sumed under one hypothesis, and the 
t test is typically used to test differ- 
ences between pairs of groups when 
each pair is construed as bearing on a
separate hypothesis. Many, if not
most, researchers are not even aware
of the various special statistics that
have been devised for the purpose of
using some unit other than the 
hypothesis as the basis for error rate. 

It is necessary to recognize, how- 
ever, that all discussions in the 
literature recommend some unit other
than the hypothesis as the basis for 
determining error rates. Ryan 
(1959) and Tukey (1953 unpub- 
lished), for example, favor the ex- 
periment as the preferred unit. The 
only dissenter to this general ap- 
proach seems to be Duncan (1955) 
who favors what is essentially a com- 
promise position. The purpose of this 
paper is to consider the pros and 
cons of the per-experiment versus the 
per-hypothesis approach. An at- 
tempt is made to make clear that 
some inconsistency is involved in
either case and that a consequence 
of this fact is that several of the 
arguments offered in favor of the per- 
experiment strategy are in fact offset 
by parallel, equally logical argu- 
ments, in favor of the per-hypothesis 
strategy. It is pointed out below that 
while it is impossible to prefer one
approach to the other on logical 
grounds, other considerations ac- 
tually favor the per-hypothesis ap- 
proach. Ryan (1959) and Tukey 
(1953 unpublished) actually speak of 
a per-comparison (rather than a per- 
hypothesis) approach as the possible 
alternative to the per-experiment 
approach. Although the two may 
seem to be similar, the per-hypothesis
approach is different from the per-comparison
in that any number of
comparisons may be considered in
testing one hypothesis; however, the
arguments presented in relation to
the per-comparison strategy apply in
the same way to the per-hypothesis
approach.

As Ryan makes clear, if a per-hypothesis
strategy is used, the same
number of errors will be expected in
100 experiments, each of which
tests one hypothesis, as will be expected
in a large experiment that
tests 100 hypotheses (Ryan, 1959,
pp. 30-34). Ryan maintains that independence
of the tests or lack of it
makes no difference: "The error rates
per comparison and per experiment 
are completely unaffected by inde- 
pendence or lack of it" (Ryan, 1959, 
p. 34). Obviously if the error rate per 
hypothesis is held constant, the error 
rate per experiment will vary, de- 
pending on the size of the experi- 
ment. On the other hand, if the error 
rate per experiment is held constant, 
the error rate per comparison will 
vary, depending again on the size of 
the experiment. Since inconsistency 
is involved in either case a choice on 
purely logical grounds does not seem 
possible. If the implications of this 
fact are followed consistently, several 
of the arguments in favor of the per- 
experiment solution become meaning- 
less. 
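
The trade-off described above can be made concrete with a short calculation. The following is a minimal sketch, assuming that every null hypothesis is true and that "error rate" means the expected number of Type I errors, which by linearity of expectation does not depend on whether the tests are independent. The function names are hypothetical, and the even split of a per-experiment rate across hypotheses is an illustrative assumption rather than a procedure taken from Ryan or Tukey.

```python
def errors_per_experiment(alpha_per_hypothesis, n_hypotheses):
    """Expected Type I errors in one experiment when each of its
    hypotheses is tested at alpha_per_hypothesis (all nulls true)."""
    return alpha_per_hypothesis * n_hypotheses


def alpha_per_hypothesis(alpha_per_experiment, n_hypotheses):
    """Per-hypothesis level when a fixed per-experiment expectation is
    divided evenly among the hypotheses (illustrative assumption)."""
    return alpha_per_experiment / n_hypotheses


if __name__ == "__main__":
    print("hypotheses  per-experiment rate   per-hypothesis rate")
    print("            (alpha=.05 per hyp.)  (.05 per experiment)")
    for m in (1, 2, 10, 100):
        print(f"{m:>10}  {errors_per_experiment(0.05, m):>20.3f}"
              f"  {alpha_per_hypothesis(0.05, m):>20.4f}")
```

Holding either rate constant forces the other to vary with the number of hypotheses in the experiment, which is the inconsistency referred to in the text.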

Ryan (1959) and Tukey (1953 un- 
published) both argue that a per- 
hypothesis strategy implicitly gives a
person license to make relatively 
more errors per experiment merely 
because he has been industrious in 
running many groups. Although this 
argument seems quite irrelevant to 
the issue, it is only fair to note the 
other side of the question. The per- 
experiment strategy implicitly gives 
a person license to make relatively 
more errors per hypothesis, merely 
because he has been lazy, as evi- 
denced by the running of few groups! 
It is hard to see how the first argu- 
ment can be considered more com- 
pelling than the second. 

Ryan (1959) also argues that a per- 
hypothesis strategy, by favoring the 
person who is industrious, as evi- 
denced by the running of many 
groups, may lead people who run 
many subjects in a two-group ex- 
periment to demand the privilege of 
using a higher error rate because 
they too have been industrious, as 
evidenced by the running of many 
subjects. While this argument seems 
a little too artificial to deserve consideration,
it is once again easy to
point out the parallel counter-argu- 
ment. The per-experiment solution, 
by favoring the person who is lazy, as 
evidenced by the running of few 
groups, might lead those who run few 
subjects in a two-group experiment to 
demand the privilege of using a 
higher error rate because they too 
have been lazy! Once again it is hard 
to argue that the possible conse- 
quences of the per-hypothesis ap- 
proach are worse than the possible 
consequences of the per-experiment 
approach. 
Some of the comments in the litera- 
ture (e.g., Ryan, 1959, pp. 35-37) 
may suggest that the use of a per- 
hypothesis strategy necessarily re- 
sults in an inordinate amount of error 
or at least in more errors than a per- 
experiment strategy. Such a con- 
clusion would be completely false. It 
is true that if a per-hypothesis error 
rate is employed there will be relatively
more errors per experiment in
large experiments, but it is also true 
that if a per-experiment error rate is 
employed there will be relatively more 
errors per hypothesis in small ex- 
periments. The total expected num- 
ber of errors can be controlled 
equally well no matter in what unit 
results of research are measured. In- 
sistence on fewer errors per-experi- 
ment would decrease total errors to 
be sure, but insistence on fewer errors 
per-hypothesis would decrease total 
errors equally well. Ryan actually 
concedes this point at one place, but 
apparently fails to recognize its im- 
plications (Ryan, 1959, pp. 37-38). 
Unless one wishes to argue that an 
error does more damage merely be- 
cause it occurs in a large experiment, 
it must be concluded once more that 
there is no logical basis on which to 
choose between the different strate- 
gies. 
The writer firmly agrees with those 
who think a more rigorous control of
errors is called for; however, he sug- 
gests that the most effective way for 
workers to achieve this is to hold the 
expected error rate constant at .001 
per hypothesis. Suppose one person 
publishes at the .05 level per experi- 
ment and a second publishes at the 
.001 level per hypothesis. Assuming 
that the second person's experiments 
test less than 50 hypotheses on the 
average, he will make fewer errors 
both per experiment and per hy- 
pothesis than will the first person. 
Clearly an experimenter can be as 
rigorous as he wishes and still use the
hypothesis as his research unit.
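
The arithmetic behind the 50-hypothesis figure can be checked directly. This is a sketch only: it assumes that error rates are expected numbers of Type I errors with all nulls true, that both researchers run experiments containing the same number of hypotheses, and that the first researcher's .05 per-experiment rate is spread evenly over those hypotheses. The function name is hypothetical.

```python
ALPHA_PER_EXPERIMENT = 0.05   # first researcher: .05 per experiment
ALPHA_PER_HYPOTHESIS = 0.001  # second researcher: .001 per hypothesis


def second_is_stricter(n_hypotheses):
    """True when the second researcher makes fewer expected errors both
    per experiment and per hypothesis than the first researcher."""
    first_per_experiment = ALPHA_PER_EXPERIMENT
    first_per_hypothesis = ALPHA_PER_EXPERIMENT / n_hypotheses  # even split assumed
    second_per_experiment = ALPHA_PER_HYPOTHESIS * n_hypotheses
    second_per_hypothesis = ALPHA_PER_HYPOTHESIS
    return (second_per_experiment < first_per_experiment
            and second_per_hypothesis < first_per_hypothesis)


if __name__ == "__main__":
    for m in (1, 10, 49, 100):
        print(m, second_is_stricter(m))
```

Both inequalities reduce to the same condition, .001 times the number of hypotheses being less than .05, which holds for fewer than 50 hypotheses per experiment.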
Another type of consideration re- 
lates to the effect that each strategy 
might have on the behavior of re- 
searchers as they design, carry out, 
and write up experiments. It has 
been argued (Ryan, 1959, p. 36) that 
a per-hypothesis type approach en- 
courages investigators to include 
"irrelevant" variables in their studies 
merely to increase their chances of 
obtaining one or more "significant"
findings to publish. Surely such 
motivation is deplorable. However, 
it is doubtful that many researchers 
will deliberately resort to such 
tactics, and surely editors will be re- 
luctant to accept implausible false 
positives no matter what statistical
techniques are used. Furthermore,
the line between adding irrelevant 
variables and exploring new possi- 
bilities is rather subtle and it is not 
at all certain that psychology would 
not profit from some additional blind 
seeking for relationships. It is nec- 
essary to insist on looking at both 
sides of the picture. What sort of
pressures does the per-experiment
procedure apply to the researcher?
It seems likely that, for better or
worse, most experimenters design
studies to demonstrate relationships
they believe to exist. Their desire is
to obtain data that will support their
hypotheses and compel others to 
accept them. Very generally it can 
be assumed that there is often a 
choice between testing a number of 
hypotheses in different experiments 
by running only the two groups ex- 
pected to be most extreme versus 
testing several hypotheses in one ex- 
periment by running several groups 
to determine the effects of each vari- 
able. 

The latter, more extensive type of 
study, is greatly to be preferred since 
it consumes less journal space per 
hypothesis, it allows for the evalua- 
tion of interaction effects, and it 
gives some idea of the shape of rela- 
tionships. The per-experiment approach
seems to discourage extensive
studies because the more extensive
the study the less the likelihood of
being able to accept any given hypothesis
as correct. In other words if a
per-experiment strategy is used, the 
smaller the pieces in which one can 
publish, the greater his chances of 
having significant findings to re- 
port. When a per-hypothesis strategy
is followed this additional encouragement
to publish in small pieces is not
present. The literature is currently
cluttered with small one-shot studies
and there is a relative dearth of well
conceived, intensive investigations.
Certainly all angles should be con- 
sidered before a strategy is advocated 
that might intensify this unfortunate 
tendency. Apparently either the per-experiment
or the per-hypothesis
strategy might have ill effects on certain
researchers, but once again it is
hard to see the arguments in favor of
the per-experiment approach as more
compelling than those favoring the
per-hypothesis approach.

In addition it can be pointed out 
that there are strong advantages to 
the per-hypothesis solution. The
question is, what is the most
[...] hypothesis
as the unit and this paper maintains 
that this is the correct choice. It 
seems that the hypothesis is psycho- 
logically the more logical unit. This 
writer, at least, would prefer to be 
confronted with a great array of findings,
all of which ([...] speaking)
have a comparable probability of
being correct, rather than to be con- 
fronted with a number of conclusions 
each of which can be accepted with 
more or less confidence depending on 
the size of the experiment from 
which they were derived. 

Another advantage of the
per-hypothesis approach is
that it requires no additional learning
on the part of researchers. Obviously
the more complicated statistics
become the more time it will take
to learn to use them and the less time 
will be available for research itself. It 
seems foolish for researchers to ac- 
cept additional statistical complica- 
tions unless there are telling reasons 
for doing so. It might also be added 
that it is practically impossible for a 
statistically naive researcher to 
abandon the traditional per-hypoth- 
esis techniques because statisti- 
cians have not yet agreed upon any 
other strategy or even on how best to 
achieve the various alternatives that 
have been advocated. Duncan 
(1955) mentions nine different solu- 
tions to the problem of multiple com- 
parisons and comments that, “Un- 
fortunately, these tests vary con- 
siderably and it is difficult for the 
user to decide which one to choose 
for any given problem" (p. 2). One 
purpose of Duncan's article was to 
propose still another solution: It has 
not received general acceptance 
(Ryan, 1959) and it seems apparent 
that statisticians have no generally 
agreed upon alternative to suggest as 
a possible replacement for the per-
hypothesis approach. 

It must be concluded that the 
arguments in favor of the per-hypothesis
strategy are more numerous
and more compelling than those in
favor of the per-experiment solution.
Therefore the less effortful per- 
hypothesis approach should be con- 
tinued indefinitely unless valid argu- 
ments are presented in favor of a 
different strategy. 
THE EXPERIMENT AS THE UNIT FOR COMPUTING
RATES OF ERROR 


THOMAS A. RYAN 
Cornell University 


I am very glad that Wilson 
(1962) has addressed himself to the 
basic issues involved in multiple com- 
parisons—issues partly of logic and 
partly of research strategy. There is 
a very real dilemma involved, and 
one which needs to be brought into 
the open even if we cannot reach a 
single solution. In my discussion of 
the problem (Ryan, 1959b), I be- 
lieved that the balance of the argu- 
ments favored the error rates based 
upon the experiment as the unit, and 
I stated that conclusion in its strong- 
est form. Perhaps the statement was 
one-sided, as Wilson believes, but I 
hoped that this would bring the issue 
more clearly to the readers than a 
less positive statement. Wilson has 
chosen the other horn of the dilemma 
and has done a service in stating the 
case for his choice so clearly. Many 
of my colleagues had previously ex- 
pressed unhappiness with my conclu- 
sions, not on logical grounds, but 
because experiment-based error rates 
made life more difficult for the re- 
searcher, who must find bigger t's for
significance.

Even a casual examination of the 
current journals will show that this is 
an issue which crops up in a very high
proportion of research reports,
and is usually unrecognized by the
authors of the reports. Wilson's
[...] arguments (apart from his
[...] more stringent significance
levels) tend to support the status
quo, but it would be a pity if his position
were adopted simply out of
[...] without careful weighing of
the advantages and disadvantages.
A thorough examination of the issues
might even lead us to abandon sig- 
nificance testing in favor of some 
more useful form of statistical treat- 
ment, although I do not yet see what 
might replace significance tests or 
confidence limits. 

The issues must be met and settled 
by psychologists themselves, or by 
other "consumers" of statistical
method. The statisticians can tell us 
how to accomplish what we want to 
do, but we must decide what we want 
to do in terms of overall research 
strategy. If we could quantify the 
costs of doing research with various 
experimental designs, the "earnings" 
due to correct conclusions and the 
"losses" due to Type I and Type II 
errors, the whole problem could be 
solved mathematically. Since, clearly, 
this can be done only in such limited 
and artificial situations that it could 
not provide a general procedure, we 
are forced to choose our procedures 
on the basis of broad and qualitative 
arguments.¹ This is what we do when 
we choose a significance level for a 
single comparison, since we are 
balancing qualitatively the risks of 
Type I error against the risks of 
Type II error. The same kind of 
qualitative balancing is necessary 
when we consider the issue of error 
rates per hypothesis versus error 
rates per experiment. Unfortunately, 
however, the balancing of risks be- 
comes much more complex than it is 
for the simple, isolated comparison. 

1 Tukey (1960) has recently made a dis- 
tinction between "decisions" and "conclu- 
sions," the latter being of more relevance to 
scientific work. He also argues that the theory 
of statistical decision is not appropriate to the 
testing of conclusions. 

I have asked for the opportunity to 
comment on Wilson's (1962) paper, 
not because I want, or expect, to 
prove that he is wrong, but because 
some of the implications of his argu- 
ment need to be pursued further. His 
conclusion may be the best one, but I 
am not yet convinced. 

Let us first stipulate (a) an im- 
portant point of agreement and (b) 
one issue which can profitably be left 
for a separate discussion: 

I would agree that there are many 
cases where overall analysis of vari- 
ance is more appropriate than multi- 
ple comparisons of individual means. 
These are cases which are essentially 
problems of regression, which I in- 
tentionally left out of my earlier 
analysis of the problem of multiple 
comparisons (see Ryan, 1959a, p. 
396). Even here the issue of error 
rates may rear its head if we wish to 
test separately for linear, quadratic, 
and higher order components, or if 
we wish to state where the maximum 
or minimum falls. This latter prob- 
lem becomes closely similar to the 
problem of multiple comparisons, so 
I shall not argue it separately. 

I shall leave aside the question of 
whether the various F tests in a com- 
plex analysis of variance should be 
treated with an error rate per F test 
(per hypothesis), or on the basis of 
an overall rate for the whole experi- 
ment. Here I only wish to make clear 
that the point of view I previously 
expressed on this problem (Ryan, 
1959b, p. 44) is not to be attributed 
to Tukey. Tukey prefers to follow 
current standard practice in complex 
analysis of variance, allocating an 
error rate to each F separately. If 
one of the variates is subjected to 
multiple comparisons, he would allow 
this family of comparisons the same 
error rate familywise that would 
otherwise have been allocated to the 
single F test for this variate. When I 
stated that the same arguments which 
lead to a familywise control of error 
could be used to support an overall 
control of error for the whole multi- 
variate experiment, this was my own 
conclusion, which Tukey did not ac- 
cept.² For the present let us leave 
the multivariate problem out of the 
discussion as too complex to deal 
with until we have settled the more 
basic question of how to deal with the 
univariate experiment. 

Turning now to what appears to 
be the principal point of difference, 
Wilson objects to the argument that 
the control of errors per hypothesis 
gives the experimenter more chance 
of finding some significant differences 
merely by being more diligent and 
studying more different conditions. I 
will admit that this is a question of 
values (but not of morals). There 
are several questions of values in- 
volved throughout significance test- 
ing; e.g., choosing to work at the .01 
level instead of the .05 level is also 
a question of value, or how important 
we consider erroneous conclusions to 
be. Wilson is justified in questioning 
my point because the problem was 
incompletely analyzed in my earlier 
discussion. At the time I was merely 
trying to express the rather vague 
notion that obtaining significant re- 
sults should not depend solely upon 
the persistence or diligence of the 
experimenter. These are admirable 
qualities, but a statement of signifi- 
cance should also bear some relation 
to the facts of nature, as well as to the 
diligence of the experimenter in seek- 
ing out these facts. 

It must be emphasized that error 
rates refer to what happens when the 
null hypothesis is true. If the experi- 
menter is so perspicacious or so lucky 
as to find real effects upon behavior, 
the Type I error rates no longer apply 
and considerations of power enter the 
picture. If, on the other hand, he is so 
unfortunate as to waste his energy 
upon a real null situation, there is no 
reason to allow him to make some 
mistakes as a consolation prize. 

2 J. W. Tukey, personal communication, 
1956. 

I will agree with Wilson, however, 
that this whole aspect of the relation 
of error rates to experimental in- 
vestment needs more exact analysis. 
Unfortunately there are so many 
facets to be considered at once that 
we have to oversimplify to make any 
sense of the problem. One approach 
to the problem is to try to hold the 
factor of experimental effort or cost 
constant while comparing different 
experimental designs. One approxi- 
mation would be to consider the total 
number of observations as a measure 
of the work done in the experiment. 
This might be a reasonable assump- 
tion on the average, since we are 
concerned with the choice of designs 
to be used with the same kind of 
measures. Suppose for example, that 
Experimenter A studies two condi- 
tions with 100 observations per condi- 
tion. Experimenter B also makes 200 
observations altogether but spreads 
them over 10 different conditions of 
the variable, and compares each 
mean with each of the others. A 
tests 1 null hypothesis, while B tests 
45 hypotheses with the same total 
amount of data. If the error rate is 
controlled per hypothesis, B can be 
expected to make .45 errors while A 
is making .01 errors, but their error 
rates will be equalized if the experi- 
ment is used as the base for comput- 
ing error rate. Thus it seems that the 
rate of error per experiment should be 
used if we wish to equalize the 
amount of Type I error for a given 
amount of experimental data spread 
over varying numbers of groups. 

Now consider Experimenter C who 
also studies 10 conditions, but col- 
lects a greater amount of data so that 
the additional groups represent addi- 
tional experimental effort. He is, as 
Wilson points out, not allowed any 
more error on the per experiment 
basis. But this is also true if error rate 
is computed per hypothesis. Specifi- 
cally, suppose that B makes 200 
observations spread over 10 condi- 
tions, while C makes 1,000 observa- 
tions on the same 10 conditions. 
Both methods of computing error 
rate would treat the two experi- 
menters alike. One would allow both 
to make .45 errors; the other would 
allow both to make .01 errors. 

C is not allowed any more Type I 
errors than B for his extra effort by 
any of the methods of computing rate 
of error, but C does gain in power 
from the extra observations. This is 
consistent with current practice, in 
that the error rate for A's single 
comparison would not be changed if 
he collected more data for the two 
conditions. In short, controlling 
errors per experiment holds the 
amount of error constant for a fixed 
amount of experimental effort 
whether it is devoted to a single pair 
of conditions or many different condi- 
tions. Controlling the rate of error 
per hypothesis allows the error rate 
to increase as the number of groups 
increases, even if the same total 
amount of experimental effort is 
spread thinly over many groups. For 
both methods, additional observa- 
tions, without a change in the re- 
search design, are used to increase the 
power of the experiment but do not 
change the rate of Type I error. 
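
The arithmetic behind the comparison of Experimenters A, B, and C can be sketched in a few lines of Python. This is only an illustration under the assumption used in the text, that every null hypothesis is true, so that the expected number of Type I errors is the number of comparisons times the level used for each comparison. The function name, and the equal division of the experimentwise level among the comparisons, are assumptions of this sketch and not a procedure taken from the discussion above.

```python
from math import comb

def expected_type1_errors(n_conditions, alpha_per_comparison):
    """Expected number of Type I errors when every pairwise null hypothesis is true."""
    n_comparisons = comb(n_conditions, 2)  # each mean is compared with each of the others
    return n_comparisons, n_comparisons * alpha_per_comparison

# Error rate controlled per hypothesis at the .01 level
print(expected_type1_errors(2, 0.01))    # Experimenter A: 1 comparison
print(expected_type1_errors(10, 0.01))   # Experimenters B and C: 45 comparisons

# Error rate controlled per experiment: hold the expected number of errors
# at .01 by sharing it equally among the 45 comparisons (an assumed allocation)
n_comparisons, _ = expected_type1_errors(10, 0.01)
print(expected_type1_errors(10, 0.01 / n_comparisons))
```

The number of observations never enters the calculation, which is the point made about B and C: extra data change the power of the tests, not the expected number of Type I errors.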

The above argument points up the 
fact that there is an arbitrary deci- 
sion involved in current practice even 
with single comparisons. It has been 
decided that power shall vary with 
number of observations, but that 
rate of Type I error shall not. This 
presumably is the historical result of 
the concept of significance develop- 
ing before the concept of power. 
Actually, of course, the concept of 
power derives from decision theory in 
which the rates of Type I and Type 
II error would both be variable and 
adjusted in terms of the costs of the 
two types of error. Instead, however, 
the concept of power has been tacked 
on as an adjunct to the significance 
test, but not controlled directly be- 
cause of our ignorance of the con- 
sequences of error. This is one of the 
aspects of present practice in signifi- 
cance testing which needs more 
thorough examination. For example, 
do we really want extra effort in re- 
search to be devoted to the detection 
of smaller and smaller differences? 
Meanwhile, if we accept current 
practice as the appropriate approach 
to single comparisons, the comparable 
solution to multiple comparisons 
would be to control the error rate per 
experiment. 

There is another argument in favor 
of basing error rate upon the single 
hypothesis, an argument which Wil- 
son does not mention directly but is 
related to his fear of discouraging 
large-scale experiments if we adopt 
the error rate per experiment. This is 
the point that even the experimenter 
who tests a single hypothesis in an 
experiment is one of many who may 
be working upon the problem over 
the years, or he may be doing one 
experiment in a series of his own. 
Yet (the argument continues) it is 
accepted practice for him to test this 
single hypothesis as though it were 
isolated from all of the studies carried 
out by him or others in the past. If 
he is allowed to do this, why should 
the experimenter who tests several 
hypotheses in the same paper be 
penalized by requiring him to limit 
his errors for the total experiment 
rather than for each hypothesis 
separately? 

This is a very powerful argument if 
we accept current practice as appro- 
priate, and if we agree that current 
practice is as described. I would ques- 
tion both of these assumptions, how- 
ever. An experimenter never con- 
siders his results in isolation from the 
rest of the available data in the field. 
One result which is out of line with 
other findings in the field is likely to 
be regarded with suspicion even if it 
is technically significant, and further 
replication will probably be called for. 
Even though this is not a quantified 
or explicit procedure it is in the same 
spirit as controlling errors for the 
total experiment in multiple compari- 
sons. We are handicapped in our 
knowledge of the total experimental 
background because of the failure to 
publish many negative findings and 
the consequent bias in the results 
available to us (see Sterling, 1959). 
Nevertheless we do, and should, try 
to take account of total mass of in- 
formation available to us in inter- 
preting any specific experimental 
result. 

Wilson (1962) fears that experi- 
menters will be discouraged from do- 
ing large experiments with many 
different conditions if we expect them 
to limit the total errors for the whole 
experiment. Yet why should they be 
discouraged? They are doing these 
experiments because they want to 
find real effects, not because they 
want to report Type I errors. Wilson 
seems to believe that I advocated 
error rates per experiment because I 
wanted to increase the stringency of 
our standards of significance. Ad- 
vocating the experiment as the 
proper base for computing error rates 
does not imply that we should set up 
more stringent criteria of significance 
than are now customary; this is a 
separate and independent question. 
It is true that a .01 level of sig- 
nificance experimentwise or per ex- 
periment means a much lower proba- 
bility level for individual comparisons 
within the experiment if many com- 
parisons are made. We do not have 
to work at the .01 level experiment- 
wise, however. The problem is not 
what particular probability should be 
chosen, but which method of com- 
puting the rate is comparable from 
one research to another. I have 
merely argued that the rates per ex- 
periment or experimentwise, whether 
.01 or .90, provide greater compara- 
bility from one research to another. 

It happens that I, like Wilson, do 
believe that we should use more 
stringent criteria of significance if 
we use significance tests at all. But 
he is quite correct in pointing out 
that greater stringency could be 
achieved by lowering the probability 
levels for errors per hypothesis as 
well as controlling errors per experi- 
ment. Consequently, the choice of 
error rates is logically irrelevant to 
the issue of stringency. My own rea- 
sons for supporting greater strin- 
gency are based upon the belief that 
Type I errors are more dangerous in 
the present state of development of 
psychology than are Type II errors. 
In other words, I believe that it is 
less important if we miss some very 
small effect of a variable, than it is to 
claim that the variable has an effect 
(of unspecified magnitude) which 
does not actually exist at all. This is, 
however, another problem of many 
facets which cannot be threshed out 
here.³ 

To summarize, Wilson (1962) has 
presented some very strong argu- 
ments for controlling error rates per 
hypothesis instead of for the whole 
experiment. It is a service to have 
this side of the issue presented so 
clearly, and it is possible that he is 
right. There are, however, strong 
counterarguments which I have tried 
to present, and which still weigh 
heavily enough to convince me that 
we should control the error rates per 
experiment. To me, the strongest 
argument is that controlling the rate 
of error per hypothesis permits wide 
variation in the total amount of error 
expected for different experimental 
designs which involve the same total 
number of observations. 

The issue is by no means settled, 
however. There are many factors 
which must be weighed against each 
other, and there are probably some 
considerations that have not yet been 
dealt with. An adequate solution of 
the problem might even lead to an 
abandonment of significance testing 
in favor of some other method of deal- 
ing with the effects of sampling error 
which would not create the dilemma 
with which we are now faced. 

3 One especially important problem is what 
we shall do with negative results (see Sterling, 
1959; Tullock, 1959). 

REFERENCES 

Ryan, T. A. Comments on orthogonal components. Psychol. Bull., 1959, 56, 394-396. (a) 
Ryan, T. A. Multiple comparisons in psychological research. Psychol. Bull., 1959, 56, 26-47. (b) 
Sterling, T. D. Publication decisions and their possible effects on inferences drawn from tests of significance—or vice versa. J. Amer. Statist. Ass., 1959, 54, 30-34. 


AN EXACT MULTINOMIAL ONE-SAMPLE 
TEST OF SIGNIFICANCE 


ALPHONSE CHAPANIS 
Johns Hopkins University¹ 


Many psychological studies yield 
nominal data—numbers of subjects, 
objects, or responses—distributed in- 
to two or more mutually-exclusive 
categories. With data of this type the 
experimenter, and the reader, usually 
want to know if the observed fre- 
quencies differ significantly from 
what one would expect on the basis 
of chance. Chi square can be used for 
making such a test provided that the 
numbers of observed frequencies ex- 
ceed certain minimum requirements. 
The binomial test can be used with 
samples of any size but it can only be 
applied to data distributed into two 
categories. This article shows how the 
multinomial distribution can be 
modified and used as an exact test of 
significance for samples of any size 
and for data distributed into any 
number of categories. 
Although the test described here fol- 
lows closely those given by Smith and 
Duncan (1945, pp. 308-326) and 
Tate and Clelland (1957, pp. 35-36), 
it differs from both of them in certain 
important respects. 

DESCRIPTION OF THE METHOD 

The probability that a sample of 
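data will yield the frequencies $n_1$, 
$n_2$, $n_3, \ldots, n_k$ distributed into k 
categories is given by the multinomial 
distribution: 

$$P = \frac{(n_1 + n_2 + \cdots + n_k)!}{n_1!\, n_2! \cdots n_k!}\; p_1^{n_1} p_2^{n_2} \cdots p_k^{n_k} \qquad [1]$$ 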

1 This paper was prepared while the author 
was on E (1959-60) with the Office of 
Naval Research in London, England. 
where the p's are the proportions with 
which the characters 1, 2, 3, ..., k 
occur in the population. 

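When we use the multinomial 
distribution as a test of significance 
we assume that the null hypothesis 
holds, that is, that the p's are equal 
and that each of them is equal to 
1/k, where k is the number of cate- 
gories. If, in addition, we let 
$N = n_1 + n_2 + \cdots + n_k$, then Equation 
1 becomes: 

$$P = \frac{N!}{n_1!\, n_2!\, n_3! \cdots n_k!} \left(\frac{1}{k}\right)^{N} \qquad [2]$$ 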


One further addition is still needed. 
If the null hypothesis holds, the 
categories are interchangeable, and we 
are interested in the probability of 
obtaining any permutation of 
any outcome. To take a simple 
example this means that all of the 
following six outcomes are equiv- 
alent: 

4 A's, 3 B's, 1 C 
4 A's, 1 B, 3 C's 
3 A's, 4 B's, 1 C 
3 A's, 1 B, 4 C's 
1 A, 3 B's, 4 C's 
1 A, 4 B's, 3 C's 

Adding this refinement to Equa- 
tion 2 gives us: 

$$P = \frac{k!}{\prod_{i=2}^{k} (i!)^{t_i}} \cdot \frac{N!}{\prod_{j=1}^{k} n_j!} \left(\frac{1}{k}\right)^{N} \qquad [3]$$ 

where: 

$P$ = the probability of obtaining 
any permutation of $n_1, n_2, n_3, \ldots, n_k$ frequencies 
$N$ = the total number of observa- 
tions (individuals, objects, or 
responses) 
$k$ = the total number of categories 
into which the N observations 
are distributed 
$i$ = any integer $\le k$ 
$t_i$ = number of ties of size i among 
the k frequencies 
$j$ = any one category 
$n_j$ = the number of observations in 
the jth category 


Note that Equation 3 without the 
$(1/k)^N$ term tells us the number of ways 
in which any particular outcome 
(such as 4,3,1) can occur. Note also 
that for k=2, Equation 3 reduces to 
the form of the binomial distribution 
which would be used for a test of this 
kind. 

Equation 3 merely gives us the 
probability that N observations will, 
by chance, be distributed into k 
categories with any particular set of 
frequencies $n_1, n_2, n_3, \ldots, n_k$. To use 
Equation 3 as a test of significance we 
need to add to this the probabilities 
of all those outcomes which are even 
more deviant than the one observed. 
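
The procedure just described can be sketched in Python. This is a minimal illustration, not the author's own computing routine: the function names `partitions`, `ways`, and `exact_multinomial_p` are hypothetical, the null hypothesis is taken to be equal proportions 1/k, and "more deviant" is taken to mean "can occur in no more ways than the observed outcome," so that outcomes tied in probability with the observed one are included in the sum.

```python
from collections import Counter
from math import factorial

def partitions(n, k, largest=None):
    """Yield every set of k non-negative frequencies, in non-increasing order, summing to n."""
    if largest is None:
        largest = n
    if k == 1:
        if n <= largest:
            yield (n,)
        return
    for first in range(min(n, largest), -1, -1):
        if first * k < n:      # the remaining parts could not make up the total
            break
        for rest in partitions(n - first, k - 1, first):
            yield (first,) + rest

def ways(freqs):
    """Number of ways an outcome, or any permutation of it, can occur."""
    arrangements = factorial(sum(freqs))
    for f in freqs:
        arrangements //= factorial(f)          # N! / (n_1! n_2! ... n_k!)
    permutations = factorial(len(freqs))
    for tie_size in Counter(freqs).values():
        permutations //= factorial(tie_size)   # k! reduced for tied frequencies
    return permutations * arrangements

def exact_multinomial_p(observed):
    """Probability of an outcome at least as deviant as the one observed."""
    n, k = sum(observed), len(observed)
    observed_ways = ways(observed)
    tail = sum(w for w in map(ways, partitions(n, k)) if w <= observed_ways)
    return tail / k ** n

# 12 observations in 3 categories with frequencies 8, 3, 1
print(ways((8, 3, 1)))                 # 11880
print(exact_multinomial_p((8, 3, 1)))  # about 0.0704
```

Working with the integer counts of ways and dividing by k^N only at the end keeps the arithmetic exact until the final step.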


ILLUSTRATION OF THE METHOD 


To illustrate the application of this 
formula I shall use some data col- 
lected by Deininger (1960). In one 
part of his experiment Deininger had 
subjects use keysets in which the keys 
had one of three maximum displace- 
ments, each a fraction of an inch. At 
the conclusion of an unspecified 
number of trials, 12 subjects voted for 
the keyset they liked least: eight votes 
went to one displacement, one vote to 
a second, and three votes to the third. 
The author concludes that the 
smallest displacement appears 
controversial, the largest unpopular. 
What is the probability of getting an 
outcome as deviant as 8,3,1 by chance? 

To apply Equation 3 note that 
$N=12$, $n_1=8$, $n_2=3$, $n_3=1$, $k=3$, 
$t_2=0$, and $t_3=0$. Inserting these 
identities into Formula 3 gives us: 

$$P = \frac{3!}{(2!)^{0}(3!)^{0}} \cdot \frac{12!}{8!\,3!\,1!} \left(\frac{1}{3}\right)^{12} = 11{,}880 \left(\frac{1}{3}\right)^{12} \approx 0.0224$$ 

The simplest way to illustrate the 
full computation of the probability 
we want is to list all possible out- 
comes and the number of ways in 
which each outcome can occur 
(since the term $(1/3)^{12}$ appears as a 
constant we can disregard it for the 
time being). Table 1 gives these data. 

Note that we have a check on these 
computations since the total for 
Table 1 is equal to $3^{12}$. 

Before continuing we need to con- 
sider what we mean by outcomes 
"even more deviant than" 8,3,1. 
What we mean are all those out- 
comes which have an even smaller 
probability of occurring. Table 2 
lists these in order. The total for 
Table 2 is 37,431 and this value multi- 
plied by $(1/3)^{12}$, or 1/531,441, 
gives us a probability of 0.0704. 

To summarize, if the null hypoth- 
esis is correct we could expect an 
outcome as deviant as 8,3,1 to occur 
about 7 times in 100. According to the 
usual conventions we would therefore 
conclude that this outcome is not 
statistically significant. 

TABLE 1 

ALL POSSIBLE OUTCOMES WHEN 12 INDIVIDUALS ARE DISTRIBUTED INTO 
3 CATEGORIES AND THE NUMBER OF WAYS EACH OUTCOME CAN OCCUR 

Outcome      Number of ways the outcome can occur 
12, 0, 0           3 
11, 1, 0          72 
10, 2, 0         396 
10, 1, 1         396 
 9, 3, 0       1,320 
 9, 2, 1       3,960 
 8, 4, 0       2,970 
 8, 3, 1      11,880 
 8, 2, 2       8,910 
 7, 5, 0       4,752 
 7, 4, 1      23,760 
 7, 3, 2      47,520 
 6, 6, 0       2,772 
 6, 5, 1      33,264 
 6, 4, 2      83,160 
 6, 3, 3      55,440 
 5, 5, 2      49,896 
 5, 4, 3     166,320 
 4, 4, 4      34,650 
Total        531,441 


TABLE 2 

OUTCOMES IN ORDER OF INCREASING LIKELIHOOD UP TO AND INCLUDING 
THE OUTCOME 8,3,1 

Outcome      Number of ways the outcome can occur 
12, 0, 0           3 
11, 1, 0          72 
10, 2, 0         396 
10, 1, 1         396 
 9, 3, 0       1,320 
 6, 6, 0       2,772 
 8, 4, 0       2,970 
 9, 2, 1       3,960 
 7, 5, 0       4,752 
 8, 2, 2       8,910 
 8, 3, 1      11,880 
Total         37,431 


THE CASE OF TIES 

The example given above is con- 
venient because it is small enough for 
us to see all the essential computa- 
tions compactly. It does not, how- 
ever, illustrate one nuance of Equa- 
tion 3, namely, what happens when 
some categories have tied observa- 
tions. In another part of his experi- 
ment Deininger (1960) had 15 sub- 
jects use five different keysets (call 
them A, B, C, D, and E) which 
differed in several ways. At the con- 
clusion of an unspecified number of 
trials each subject voted for the key- 
set he liked best. The results were 2 
votes for A, 1 for B, 4 for C, 6 for D, 
and 2 for E. Now we want to test the 
significance of the outcome 2,1,4,6,2. 
The novel feature of these data is 
the two 2s. 

In this case $N=15$, $n_1=2$, $n_2=1$, 
$n_3=4$, $n_4=6$, $n_5=2$, $k=5$, $t_2=1$ (for 
$n_1$ and $n_5$), $t_3=0$, $t_4=0$, and $t_5=0$. In- 
serting these values into Equation 3 
gives us: 

$$P = \frac{5!}{(2!)^{1}(3!)^{0}(4!)^{0}(5!)^{0}} \cdot \frac{15!}{2!\,1!\,4!\,6!\,2!} \left(\frac{1}{5}\right)^{15} = 60 \cdot \frac{15!}{2!\,1!\,4!\,6!\,2!} \left(\frac{1}{5}\right)^{15} \approx 0.037$$ 

The p of 0.037 is, of course, merely 
the probability of getting exactly an 
outcome of 6,4,2,2,1 or some per- 
mutation of this outcome. The 
probability of an outcome as deviant 
as 6,4,2,2,1 (computed in a manner 
analogous to that shown in Table 2) 
is 0.49. It has, in short, no statistical 
significance whatsoever. 


A COMPARISON OF THE EXACT 
MULTINOMIAL TEST WITH 
CHI SQUARE 


As noted above, when k=2, the 
exact multinomial test given by 
Equation 3 reduces to the formula 
which would be used for an exact 
binomial test. It is of interest, how- 
ever, to compare the outcomes of the 
exact test given in this article with 
those of the chi square approximation 
commonly used for this purpose. 
Table 3 shows all possible outcomes 
when 12 individuals are distributed 
into three categories (from Table 1) 
and the exact probabilities of obtain- 
ing outcomes at least as deviant as 
those listed. For comparison, the 
third column of Table 3 shows the 
corresponding probabilities² com- 
puted by the chi square formula. 
One reason why chi square probabil- 
ities often do not agree with those 
calculated by exact tests is that chi 
square uses a continuous distribu- 
tion to approximate discrete ones. 
To compensate for the errors in- 
volved in this approximation statis- 
ticians often recommend applying a 
correction for continuity. Although 
corrections for continuity tend to 
overcompensate a little they do 
usually bring chi square probabilities 
into closer agreement with their true 
values. In the fourth column of Table 
3 the chi square probabilities have 
been corrected for continuity accord- 
ing to the method recommended by 
Cochran (1952). 


TABLE 3 

EXACT AND CHI SQUARE PROBABILITIES OF EVERY POSSIBLE OUTCOME WHEN 12 
INDIVIDUALS ARE DISTRIBUTED INTO 3 CATEGORIES 

                            Chi square probability 
Outcome      Exact          Uncorrected    Corrected for 
             probability                   continuity 
12, 0, 0     0.000006       0.00001        0.00003 
11, 1, 0     0.000141       0.00010        0.00030 
10, 2, 0     0.00163        0.00091        0.00104 
10, 1, 1     0.00163        0.00117        0.00248 
 9, 3, 0     0.00412        0.00525        0.00674 
 6, 6, 0     0.00933        0.0498         0.0725 
 8, 4, 0     0.0149         0.0183         0.0267 
 9, 2, 1     0.0224         0.00866        0.0126 
 7, 5, 0     0.0313         0.0388         0.0440 
 8, 2, 2     0.0481         0.0498         0.0725 
 8, 3, 1     0.0704         0.0388         0.0440 
 7, 4, 1     0.115          0.106          0.135 
 6, 5, 1     0.178          0.174          0.253 
 4, 4, 4     0.243          1.000          1.000 
 7, 3, 2     0.332          0.174          0.253 
 5, 5, 2     0.426          0.472          0.606 
 6, 3, 3     0.531          0.472          0.606 
 6, 4, 2     0.687          0.368          0.417 
 5, 4, 3     1.000 
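
The uncorrected chi square column can be checked with a short Python sketch. It assumes equal expected frequencies N/3 and relies on the fact that, with 2 degrees of freedom, the upper-tail chi square probability is exactly exp(-χ²/2); the function name is hypothetical, and the correction for continuity is not reproduced here because its exact form is not given in the text.

```python
from math import exp

def chi_square_p_three_categories(observed):
    """Uncorrected chi square and its probability for 3 equally likely categories (2 df)."""
    if len(observed) != 3:
        raise ValueError("this sketch handles exactly three categories")
    expected = sum(observed) / 3
    chi_square = sum((o - expected) ** 2 / expected for o in observed)
    # For 2 degrees of freedom the upper-tail probability has a closed form
    return chi_square, exp(-chi_square / 2)

for outcome in [(6, 6, 0), (9, 2, 1), (8, 3, 1)]:
    chi_square, p = chi_square_p_three_categories(outcome)
    print(outcome, chi_square, round(p, 5))
```

For 6,6,0 this gives a chi square of 6 and a probability of about 0.0498, against an exact probability of 0.00933.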

Table 3 shows some striking dis- 
crepancies between the chi square 
probabilities, both uncorrected and 
corrected, and those resulting from 
the exact test. Note especially the 
number of discrepancies in the critical 
areas around the 1 and 5% points. 
The outcome 6,6,0, for example, is 
significant at the 1% level by the ex- 
act test, scarcely significant at the 
5% level by the uncorrected chi 
square test, and not significant at 
the 5% level by the corrected chi 
square test. Similar large discrep- 
ancies occur for the outcome 9,2,1; 

; and 8,3,1. 

Smith and Duncan (1945) 
assumed, as the chi square test does, 
that the probability of any occur- 
rence is proportional to the evenness 
of the distribution of the N observa- 
tions in the k categories. For this 
reason they stated that zones of re- 
jection, and so the statistical sig- 
nificance of any outcome, would be 
proportional to $D^2$, where, in my 
notation, 

$$D^2 = \sum_{j=1}^{k} \left(n_j - \frac{N}{k}\right)^2$$ 

For the very small example they gave 
(N=5, k=3) this happened to be 
true, but it is certainly not true in 
general. The outcomes 6,6,0 and 
8,2,2 (7,5,0 and 8,3,1; 6,5,1 and 
7,3,2; and 5,5,2 and 6,3,3) have 
identical values of $D^2$ but markedly 
different probabilities of occurrence 
by the multinomial distribution. The 
symmetry implicit in $D^2$ is, in fact, 
one of the reasons why the chi 
square test yields probabilities which 
differ so markedly from the true ones 
(Table 3). 

2 These values were obtained, by linear 
interpolation when necessary, from the 
Pearson-Hartley (1956) tables. 
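
The point about pairs of outcomes can be verified directly. The sketch below assumes that D² is the sum of squared deviations of the frequencies from their mean N/k (any measure proportional to it gives the same comparison); the function names are hypothetical.

```python
from collections import Counter
from math import factorial

def d_squared(freqs):
    """Sum of squared deviations of the frequencies from N/k (assumed form of D squared)."""
    mean = sum(freqs) / len(freqs)
    return sum((f - mean) ** 2 for f in freqs)

def ways(freqs):
    """Number of ways an outcome, or any permutation of it, can occur."""
    count = factorial(sum(freqs))
    for f in freqs:
        count //= factorial(f)
    permutations = factorial(len(freqs))
    for tie_size in Counter(freqs).values():
        permutations //= factorial(tie_size)
    return permutations * count

pairs = [((6, 6, 0), (8, 2, 2)), ((7, 5, 0), (8, 3, 1)),
         ((6, 5, 1), (7, 3, 2)), ((5, 5, 2), (6, 3, 3))]
for a, b in pairs:
    # Same D squared within each pair, different numbers of ways
    print(a, b, d_squared(a), d_squared(b), ways(a), ways(b))
```

Each pair shares a single value of D² while the numbers of ways, and hence the multinomial probabilities, differ; for 6,6,0 and 8,2,2 the counts are 2,772 and 8,910.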

However, the real source of the 
discrepancies between the two kinds 
of tests lies even deeper than this. 
Although the formula for chi square 
is derived from that of the multi- 
nomial distribution (for example, 
Kendall, 1947), at three separate 
points the derivation makes use of 
approximations which are valid only 
for large Ns. Table 3 shows that the 
cumulative effect of the errors in 
these approximations may be con- 
siderable when chi square is applied 
to data with small N's. 




A DISADVANTAGE OF THE 
EXACT TEST 


Perhaps the chief disadvantage of 
the exact test described here is that 
it is laborious to calculate and very 
quickly becomes prohibitively diffi- 
cult to apply when N or k become 
large. The first example given in this 
paper is relatively straightforward 
and not too tedious. The second ex- 
ample (with 15 individuals dis- 
tributed into five categories), how- 
ever, required the computation of 84 
separate outcomes with a total of 
30,517,578,125 ways in which the out- 
comes could occur. This problem is 
almost a little too big for a desk 
calculator and a little too small for a 
digital computer. 
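
The size of the second example can be confirmed by enumeration. The following Python sketch (hypothetical function names, equal proportions assumed) lists every distinct set of five frequencies summing to 15 and counts the ways each can occur.

```python
from collections import Counter
from math import factorial

def partitions(n, k, largest=None):
    """Yield every set of k non-negative frequencies, in non-increasing order, summing to n."""
    if largest is None:
        largest = n
    if k == 1:
        if n <= largest:
            yield (n,)
        return
    for first in range(min(n, largest), -1, -1):
        if first * k < n:      # the remaining parts could not make up the total
            break
        for rest in partitions(n - first, k - 1, first):
            yield (first,) + rest

def ways(freqs):
    """Number of ways an outcome, or any permutation of it, can occur."""
    count = factorial(sum(freqs))
    for f in freqs:
        count //= factorial(f)
    permutations = factorial(len(freqs))
    for tie_size in Counter(freqs).values():
        permutations //= factorial(tie_size)
    return permutations * count

outcomes = list(partitions(15, 5))
print(len(outcomes))                  # 84 separate outcomes
print(sum(ways(o) for o in outcomes)) # 30517578125, that is, 5 ** 15
```

On present-day machines an enumeration of this size finishes almost at once, although the number of outcomes still grows quickly as N and k increase.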


SUMMARY 


The exact multinomial test de- 
scribed in this article can be used to 
test the significance of variations in 
the numbers of observations distrib- 
uted into two or more mutually- 
exclusive categories. When there are 
only two categories the test reduces 
to the binomial test. The test is valid 
for samples of any size but it quickly 
becomes prohibitively difficult to 
apply as the total number of observa- 
tions or the number of categories in- 
creases. A comparison with the chi 
square test shows how seriously the 
latter may be in error when the num- 
ber of observations is small. 


THE ANALYSIS OF PROFILE DATA 
Vanderbilt University 


During the last 20 years, the crys- 
tal-ball-gazing test interpreter has 
gradually been supplanted by the 
profile-gazing tester. He gazes stead- 
fastly at the ups and downs on the 
profile chart for the Kuder Prefer- 
ence Record, the MMPI, and the 
Wechsler subtests; and from these he 
gives vocational advice, classifies the 
mentally ill, and searches for brain 
damage. Also, profile analysis has 
invaded psychological research, 
where comparisons are made between 
self and ideal-self ratings, and meas- 
ures are made of interpersonal per- 
ception. Being scientific folk, some 
psychologists reasoned that if profile 
analysis is used (later it will be argued 
that sometimes it is better not to), 
then it should be used "objectively," 
i.e., in a mathematical and statistical 
framework. 

There are three kinds of questions 
that profile analysis needs to answer: 

1. How do you measure the rela- 
tive similarity of two profiles? 

2. How do you discriminate the 
typical profiles of two or more 
groups, e.g., MMPI profiles of dif- 
ferent diagnostic groups? 

3. How do you "cluster" profiles 
into homogeneous groups? 

A surprising amount of controversy has 
raged over how to answer these ques- 
tions (see Haggard, Chapman, Isaacs, 
& Dickman, 1959, for some pro- 
posed solutions; see Sawrey, Keller, 
& Conger, 1960, for other proposed 
solutions and a review of the rele- 
vant literature), surprising because 
there are relatively simple, and sat- 
isfactory (I hope), answers to all 
three. The proposed answers to each 
of these will be discussed in turn. 


SIMILARITY OF PROFILES 


There are two principal criteria by 
which to judge any measure of rela- 
tionship: it should consider all of the 
information relevant to the com- 
parisons and it should have mathe- 
matical properties which permit 
powerful methods of analysis. The 
first is partly a matter of prefer- 
ence; but once the desired measure is 
formulated, it greatly influences the 
kinds of analyses that can be per- 
formed. 

Cronbach and Gleser (1953) re- 
viewed the many proposed measures 
of profile similarity, criticized most 
of them, and recommended the use 
of the d measure, which is the square 
root of the sum of squared differ- 
ences between profile elements. In an 
earlier paper, Osgood and Suci (1952) 
had also proposed the use of d. The 
argument for the use of d is that it 
considers all of the possible informa- 
tion in the profiles: level, shape, and 
dispersion. With respect to the first 
criterion given above for choosing a 
measure, d is appealing; and no one 
has proposed a more appealing 
measure. 
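
As a concrete illustration of the d measure, the sketch below computes the square root of the sum of squared differences between corresponding elements of two profiles. The function name and the sample scores are hypothetical and serve only to show that d responds to level as well as to shape and dispersion.

```python
from math import sqrt

def profile_distance(profile_a, profile_b):
    """d: square root of the sum of squared differences between profile elements."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same number of elements")
    return sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))

person_a = [50, 60, 55]   # made-up scores on three scales
person_b = [53, 64, 55]   # differs from person_a on the first two scales
person_c = [60, 70, 65]   # same shape as person_a, 10 points higher in level

print(profile_distance(person_a, person_b))  # 5.0
print(profile_distance(person_a, person_c))  # nonzero although the shapes are identical
```

Because person_c has the same shape as person_a and differs only in level, the nonzero distance shows why d is appropriate only when level is meant to count in the comparison.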

The d measure also stands up well 
with respect to the second criterion. 
By using a measure of interpoint 
distance in Euclidean space, powerful 


methods of analysis are indeed avail- 


able, ones which will be discussed 
more fully in the following sections. 
Because of these reasons, the author 
recommends, as others have, that 
profiles be considered as points in 
Euclidean space; however, as will be 
shown later, it is actually better to 
use a function of d rather than d itself 
in the analysis of profile data. 
The use of d is appealing if, and 
only if, it is intended to compare pro- 
files simultaneously with respect to 
level, shape, and dispersion. Later 
in the article it will be argued that in 
some studies it would be more mean- 
ingful to equate all profiles for level, 
and in other studies to equate for 
both level and dispersion, in which 
cases it would be more appropriate 
to use covariances, and correlations, 
respectively, rather than d. It will be 
shown that the same, powerful 
methods can be used with "raw" 
profiles as can be used with covari- 
ances and correlations. 


DISCRIMINATION OF GROUPS 


If one accepts the Euclidean 
model, a powerful method of analysis 
is available for discriminating the 
typical profiles of two or more groups, 
namely, the linear multiple-discrimi- 
nant function (Tatsuoka & Tiede- 
man, 1954). This will provide the 
best (in a least-squares sense) linear 
combination(s) for discriminating 
the groups, and it offers a procedure 
for assigning new individuals to one 
of the groups. For example, the dis- 
criminant function could be applied 
to the problem of differentiating the 
MMPI profiles of paranoids, psycho- 
paths, and schizophrenics; and the 
results could be used to classify new 
cases into one of the three groups. 


CLUSTERING "RAW" PROFILES


Clustering raw profiles is the prob- 
lem that has aroused so much dis- 
cussion, and the major purpose of this 
article is to attempt a satisfactory 
solution. This solution probably 
would have been adopted long ago 
had it not been for one mistaken 
notion among some psychologists 
about multivariate analysis. 

FIG. 1. Interpoint distances for six persons.


Let us set the problem in focus by
imagining that we are studying
MMPI profiles and that we have the
profiles from a broad sample of 
psychotic patients. We want to 
study the interrelations among the 
raw profiles in such a way as to say 
how many "kinds" (clusters) of
profiles there are, and we want to 
measure the extent to which each 
patient belongs to each cluster. 
First, we will assume that relation- 
ships among the profiles should be 
pictured as interpoint distances in 
Euclidean space. (Some arguments 
for so doing were given above.) 

In Figure 1 are pictured the 
hypothetical points for six patients, 
which are shown as lying in a two- 
space in order to simplify the illustra- 
tion. By arbitrarily designating the 
distance from Person a to Person b 
as 1, all of the interpoint distances 
are set, and these are presented in 
Table 1. 

In looking at Figure 1 and Table 1
it is obvious that there are two clusters,
defined, respectively, by Patients
a, b, and c, and by Patients d,
e, and f. If in actual research there
were so few cases involved and such
definite clusters were present, no refined
method of analysis would be
needed; but this is almost never the 
case. A method of analysis will be
demonstrated which can recover
these clusters and can be used equally
well with any number of cases and
regardless of the relative "visibility"
of clusters. 

It is apparently not widely known 
that d matrices such as that in 
Table 1 can be factored. The method 
was derived by G. J. Suci (Osgood, 
Suci, & Tannenbaum, 1957). Suci 
and I cooperatively explored his 
method of factoring d and found it to 
be a special case of raw score factor 
analysis. This is where the major 
misconception arises: some psychol- 
ogists are evidently unaware that 
raw score cross-products can be fac- 
tored in the same way as correlation 
coefficients are factored. 

The failure to realize that factor 
analysis is not restricted to correla- 
tion coefficients is either directly 
evident or implied in many of the 
papers relating to methods of clus- 
tering profiles. Here is an example 
(Sawrey et al., 1960): 

Surely all factor analytic studies have not
been interested in shape alone, yet this is, in
fact, all that correlations, and consequently
[italics added] factor analysis, takes into
account (p. 670).


An Example of Raw Score Factor
Analysis 


TABLE 1
MATRIX OF d's FOR POINTS SHOWN IN FIGURE 1
[table body not recoverable]


Because of the unfamiliarity of
factoring raw score cross-products, a
worked-out example will be given.
The first step is to obtain the sum of 
raw cross-products for each pair of 
patients over the profile elements. 
For the MMPI this consists of ac- 
cumulatively multiplying the scores 
on corresponding scales for each pair 
of patients. A hypothetical matrix of 
such cross-products corresponding to 
the d matrix in Table 1 is shown in 
Table 2. Because I have chosen an 
artificial example, the cross-product
terms look different from what would 
be obtained from an actual study of 
MMPI profiles. 

How should one analyze Table 2 in 
order to obtain clusters? The answer 
is to factor analyze, and any of the 
methods commonly used with cor- 
relations can be applied: square root, 
multiple group, centroid, principal 
components, or what not. In doing 
this, the customary formulas are 
applied in the customary ways. Let 
us see what a centroid analysis pro- 
vides. 

For the first factor, sum the ele- 
ments in each column, find the 
square root of the sum of the column 
sums, and divide this into each of the 
column sums. These are loadings on 
the first centroid factor in the raw 
score space. Use the first factor load- 
ings to obtain a first set of residuals, 
reflect, extract a second set of cen- 
troid loadings, and continue in this 
manner until residuals are "small" or
until enough factors have been obtained
to satisfy the experimenter's
curiosity.
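
The centroid steps described above can be sketched as follows. This is an illustration only, not the author's own computing routine: the cross-products are copied from Table 2, the function names are hypothetical, and the reflection pattern for the second factor (persons a, b, and c reflected) is an assumption chosen by inspecting the signs of the first residuals.

```python
import math

# Cross-products for persons a-f, copied from Table 2.
X = [
    [36, 30, 30, 6, 6, 12],
    [30, 25, 25, 5, 5, 10],
    [30, 25, 26, 11, 10, 15],
    [6, 5, 11, 37, 31, 32],
    [6, 5, 10, 31, 26, 27],
    [12, 10, 15, 32, 27, 29],
]


def centroid_factor(matrix, signs):
    """Extract one centroid factor; signs (+1 or -1) mark reflected persons."""
    n = len(matrix)
    # Reflect: change the sign of the rows and columns marked -1.
    reflected = [[signs[i] * signs[j] * matrix[i][j] for j in range(n)]
                 for i in range(n)]
    col_sums = [sum(row[j] for row in reflected) for j in range(n)]
    root = math.sqrt(sum(col_sums))
    # Undo the reflection so that each loading keeps its original sign.
    return [signs[j] * col_sums[j] / root for j in range(n)]


def residuals(matrix, loadings):
    """Subtract the cross-products reproduced by one factor."""
    n = len(matrix)
    return [[matrix[i][j] - loadings[i] * loadings[j] for j in range(n)]
            for i in range(n)]


first = centroid_factor(X, [1] * 6)
r1 = residuals(X, first)
second = centroid_factor(r1, [-1, -1, -1, 1, 1, 1])
r2 = residuals(r1, second)
largest_second_residual = max(abs(v) for row in r2 for v in row)
```

Carried at full precision, the loadings agree with those printed in Table 2 to within about .01, and the second residuals vanish apart from floating-point error, since the six points lie in a two-space.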

By choosing a set of points in a 
two-space, only two factors are 
needed to explain the cross-products, 
and, consequently, the second residu- 
als differ from zero only by rounding 
errors. Also, as would necessarily be 
the case, the sums of squares of “load- 
ings” in rows of the factor matrix are 
identical to the original diagonal elements
in the cross-product matrix
(which are the sums of squared scores 
over the profile elements for each 
patient). 

By applying the orthogonal trans- 
formation shown in Table 2, a ro- 
tated factor solution is obtained. The 
clusters shown in Figure 1 and Table 1 
are clearly evidenced in the rotated 
factor solution, and the factor load- 
ings tell how much each patient be- 
longs to each factor. In Figure 2 the 
rotated factors are plotted, and the 
obtained set of interpoint distances is
identical to that shown in Figure 1.
If one wants to cluster profiles, raw 
score factor analysis is a powerful and 
directly applicable procedure. 
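
The rotation can be checked with a short sketch. It assumes that the rows of the transformation matrix in Table 2 correspond to centroid factors I and II and its columns to rotated factors A and B; the loadings and coefficients are copied from Table 2, and the helper name is hypothetical.

```python
# Centroid loadings (I, II) for each person, as printed in Table 2.
centroid = {
    "a": (4.58, -3.89),
    "b": (3.81, -3.24),
    "c": (4.46, -2.48),
    "d": (4.65, 3.92),
    "e": (4.00, 3.16),
    "f": (4.77, 2.51),
}

# Transformation matrix from Table 2 (rows: I, II; columns: A, B).
T = [[0.763, 0.647],
     [-0.647, 0.763]]


def rotate(loadings, t):
    """Post-multiply one row of centroid loadings by the transformation."""
    first, second = loadings
    return (first * t[0][0] + second * t[1][0],
            first * t[0][1] + second * t[1][1])


rotated = {person: rotate(l, T) for person, l in centroid.items()}
```

Under these assumptions persons a, b, and c load mainly on Factor A and persons d, e, and f mainly on Factor B, which is the two-cluster structure described in the text.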


How Raw Score Analysis Works 


Elements in profiles (e.g., the Paranoid
scale of the MMPI) can be considered
as mutually orthogonal axes
in Euclidean space. Each profile can
be “plotted” as a point in the space, 
and d measures the distance of points 
from one another. 


TABLE 2
RAW SCORE CROSS-PRODUCTS AND FACTOR SOLUTION FOR POINTS SHOWN IN FIGURE 1


Cross-products

Person               a        b        c        d        e        f
a                   36       30       30        6        6       12
b                   30       25       25        5        5       10
c                   30       25       26       11       10       15
d                    6        5       11       37       31       32
e                    6        5       10       31       26       27
f                   12       10       15       32       27       29
Column sums        120      100      117      122      105      125
First factor      4.58     3.81     4.46     4.65     4.00     4.77

First residuals

Person               a        b        c        d        e        f
a                15.02    12.55     9.57   -15.30   -12.32    -9.85
b                12.55    10.48     8.00   -12.72   -10.24    -8.17
c                 9.57     8.01     6.11    -9.74    -7.84    -6.27
d               -15.30   -12.72    -9.74    15.38    12.40     9.82
e               -12.32   -10.24    -7.84    12.40    10.00     7.92
f                -9.85    -8.17    -6.27     9.82     7.92     6.25
Column sums
after reflexion  74.61    62.17    47.54    75.36    60.72    48.28
Second factor    -3.89    -3.24    -2.48     3.92     3.16     2.51
Second residuals

[Entries not recoverable; as stated in the text, all differ from zero only by rounding errors.]


TABLE 2 (Continued)


Centroid factors

Person        I       II       h²
a          4.58    -3.89       36
b          3.81    -3.24       25
c          4.46    -2.48       26
d          4.65     3.92       37
e          4.00     3.16       26
f          4.77     2.51       29

Transformation matrix

              A        B
I          .763     .647
II        -.647     .763

Rotated factors

Person        A        B
a          6.01      .00
b          5.00     -.01
c          5.01     1.00
d          1.01     6.00
e          1.01     5.00
f          2.02     5.00

Raw score factor analysis provides
a basis (or semibasis) for the profile
space. Because any sufficient basis
preserves distances between points,
the factor loadings preserve the original
d's. In the example worked out
above, this can be tested by obtaining
d's from the two rotated factors.
This shows, for example, that
d between Persons a and b is,
within the limits of rounding errors,
what was given in Table 1.
Similarly, all of the d's can be calculated
from the factor matrix. If factoring
is not complete, then the factor
matrix will serve to explain the bulk
of the original distances.


FIG. 2. Loadings on rotated factors. (Axes: loading on Factor A, loading on Factor B.)


A difficulty with most of the proposed
measures of profile similarity is
that they are non-Gramian, e.g.,
Cattell's r_p (1949), and, consequently,
powerful methods of multivariate
analysis cannot be used. (A Gramian
matrix is one whose elements are
cross-products; see Hohn, 1958.) Of course,
matrices of cross-products are necessarily
Gramian, and powerful methods
of multivariate analysis, such as
factor analysis, can be applied to
them. Whenever there is a choice be- 
tween a number of descriptive meas- 
ures where one is Gramian and the 
others are not (e.g., the choice be- 
tween point biserial and biserial cor- 
relation), it greatly facilitates the 
analysis of results to choose the 
Gramian measure. 
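
The link between cross-products and d can be made concrete with the standard identity that the squared distance between two profiles equals the sum of their two sums of squares minus twice their cross-product. The sketch below applies that identity to the cross-products of Table 2; the identity is general Euclidean geometry rather than a formula quoted from the article, and the function name is hypothetical.

```python
import math

persons = "abcdef"

# Cross-products for persons a-f, copied from Table 2.
X = [
    [36, 30, 30, 6, 6, 12],
    [30, 25, 25, 5, 5, 10],
    [30, 25, 26, 11, 10, 15],
    [6, 5, 11, 37, 31, 32],
    [6, 5, 10, 31, 26, 27],
    [12, 10, 15, 32, 27, 29],
]


def distance_from_cross_products(x, i, j):
    """d_ij from d_ij^2 = x_ii + x_jj - 2 * x_ij."""
    return math.sqrt(x[i][i] + x[j][j] - 2 * x[i][j])


d = {
    (persons[i], persons[j]): distance_from_cross_products(X, i, j)
    for i in range(len(persons))
    for j in range(i + 1, len(persons))
}
```

With these cross-products the distance from Person a to Person b comes out as 1, the arbitrary unit adopted for Figure 1, while distances between the two clusters are several times larger.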


Preparation for Analysis 


Much of the controversy about the 
analysis of profile data has concerned 
what, if any, transformations should 
be made before the data is analyzed. 
Regardless of what transformations 
are made, factor analysis of cross- 
product terms is a powerful method 
available to search for clusters. Two 
kinds of transformations have been 
proposed: transformations of distri- 
butions of individual differences on 
profile elements, and transformations 
of profiles as a function of intra- 
individual distributions. Each will 
be considered in turn. 

Individual differences. If the indi- 
vidual profile elements have grossly 
different standard deviations, some 
elements will contribute more to the 
interpoint distances than will others. 
For example, on the Rorschach, the 
number of F+ responses has a much 
larger standard deviation than the 
number of pure C responses, and, con- 
sequently, the former would more 
strongly influence the size of inter- 
point distances in a space of Ror- 
schach profiles. In many studies of 
profiles the elements have approxi- 
mately the same dispersions: MMPI, 
the Semantic Differential, and the 
subtests of the multifactor test bat- 
teries. When the standard deviations 
of profile elements are grossly dif- 
ferent, it is generally wise to equate 
them before using cross-products 
analysis to search for clusters. 

Profile elements differ not only in
terms of dispersions; they also differ
in terms of their factor compositions. 
For example, profile elements on the 
Semantic Differential differ in terms 
of the factors of evaluation, potency, 
activity, and others. To the extent 
that one factor is more prominent 
than others in the collection of profile 
elements, that factor will more 
strongly influence the size of inter- 
point distances. One way to offset 
the differential influences of such 
factors is to factor analyze the profile 
elements (correlating over persons) 
and reduce the profiles to sets of fac- 
tor scores. Then the cross-products 
analysis of profiles can be made of the 
sets of factor scores. For example, in- 
stead of beginning with a space of 
interpoint distances formed by indi- 
vidual Semantic Differential scales, 
we can begin the analysis of profiles 
by constructing a space of Semantic 
Differential factor scores. Factor 
analysis of cross-products applies 
equally well in this situation, and it 
can be used regardless of the kinds of 
transformations that are made on 
profile elements. 

If the purpose of the analysis is to 
discriminate the typical profiles of 
two or more groups (Question 2, 
posed earlier), then nothing can be 
gained from transforming score dis- 
tributions on profile elements. The 
discriminant function will provide 
the same results whether or not the 
dispersions are equated. Also, the 
resolution of profile elements into 
factors cannot possibly add to the 
discriminability that would be obtained
from a discriminant-function
analysis of the elements themselves.
However, a prior factor analysis of
profile elements is sometimes wise
because it simplifies the subsequent
use of the discriminant function,
leaves less opportunity for the discriminant
function to "take advantage
of chance," and usually makes
the discriminant functions more in- 
terpretable. 

Intraindividual distributions. If, as 
some claim, profiles should be clus- 
tered with simultaneous respect to 
level, shape, and dispersion, then 
factor analysis should be made of raw 
cross-products, either on the un- 
transformed profile elements or after 
transformations of the kinds dis- 
cussed previously are made. 

If level is considered to be un- 
important in clustering profiles, then 
the means of all profiles should be 
equated before the analysis, prefera- 
bly equated to zero. Next form cross- 
product terms and divide each by the 
number of profile elements; then 
factor by any of the conventional 
methods. This is called covariance 
factor analysis, but it is only a special 
case of cross-products analysis. 

If both level and dispersion are con- 
sidered unimportant, convert all pro- 
files to standard scores, standardizing 
over the profile elements. Then form
a matrix of cross-products and divide
each term by the number of profile 
elements. This gives a correlation 
matrix, and no one needs to be told 
that it can be factor analyzed. 
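
The two intraindividual transformations can be sketched as follows, assuming two hypothetical three-element profiles. Standardizing with the standard deviation computed over the number of elements (rather than one less) is a choice made here so that the mean cross-product of standard scores equals the ordinary correlation coefficient; the function names are illustrative.

```python
import math


def equate_level(profile):
    """Subtract the profile's own mean so that its level is zero."""
    mean = sum(profile) / len(profile)
    return [x - mean for x in profile]


def standardize(profile):
    """Equate level and dispersion; fails for a perfectly flat profile."""
    centered = equate_level(profile)
    sd = math.sqrt(sum(x * x for x in centered) / len(profile))
    return [x / sd for x in centered]


def mean_cross_product(x, y):
    """Sum of cross-products divided by the number of profile elements."""
    return sum(a * b for a, b in zip(x, y)) / len(x)


# Hypothetical profiles (illustrative scores only).
p = [2.0, 4.0, 6.0]
q = [1.0, 5.0, 3.0]

covariance = mean_cross_product(equate_level(p), equate_level(q))
correlation = mean_cross_product(standardize(p), standardize(q))
```

A matrix of such terms, formed for every pair of persons, can then be factored in the same way as the raw cross-products.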

If the purpose of the analysis is to
discriminate the typical profiles of
two or more groups (Question 2 posed
earlier), it is an empirical question
whether or not transformations of intraindividual
distributions will help
or hinder the outcome in particular
studies. For example, if in a particular
study all of the profiles are
equated for level, this might increase
discriminability or it equally well
may lower discriminability. Consequently,
before applying the discriminant
function, it is wise to compare
groups with respect to level,
shape, and dispersion. If groups
differ inconsequentially on any of the
components, it is wise to remove that
component(s) before applying the
discriminant function.


SHOULD PROFILES BE ANALYZED? 


Most of this article is concerned 
with how to analyze profile data. 
Equally important is the initial de- 
cision in research to make compari- 
sons among score profiles. Perhaps in 
many situations it would be wiser not 
to make such comparisons at all. 

The decision to use profile analysis 
is determined in part by preferences 
for methodologies, which are, in essence,
wagers about the likely research
payoff in the long run from choosing 
one method of investigation rather 
than another. The reader can judge 
for himself whether the studies using 
comparisons of profiles (e.g., meas- 
ures of "assumed similarity" in inter- 
personal perception) have borne the 
expected fruit. 

If analyses are made of the rela- 
tions among raw profiles, in which 
level, shape, and dispersion are pre- 
served, the results are often difficult 
to interpret. Particular results may 
be due to any one of the three profile 
components, and, without reanalyz- 
ing differently, there is no way to un- 
ravel the puzzle. Even those who 
initially advocated the analysis of 
raw profiles have since either advo- 
cated or practiced separate analyses 
of level, shape, and dispersion (for 
example, Cronbach, 1958).

It was argued that factor analysis 
of cross-products is the best way to 
cluster profiles. However, when such 
analyses are made of raw profiles, it 
is sometimes difficult to interpret the 
results. Most of us have become so 
familiar with correlation coefficients, 
and factor analyses of them, that it 
raises some anxiety to look at factor 
loadings like -68.21 and 4.89.

A good argument can be given for
the use of profile analysis in studies
of personnel decisions, e.g., selecting 
men for a particular job, or classify- 
ing patients for different kinds of 
treatment. If criterion variables are 
available, the validity of any decision 
strategy based on profile analysis can 
be determined directly. Then the 
only sense in which it is necessary to 
justify the analysis of profiles is to 
show that it works better than some 
other approach. For example, it 
might be found that a discriminant- 
function analysis of score profiles is 
more effective in classifying mental 
patients than is a multiple regression 
approach. The major difficulty in 
validating profile analyses is that in 
many types of personnel decisions 
there are no adequate criteria avail- 
able, and the questions of whether to 
use profile analysis and, if so, how, 
are left moot.

It is more difficult to argue for the
use of profile comparisons in testing 
psychological theories. Many efforts 
have been made to assert hypotheses 
about interpersonal perception, psy- 
chotherapy, empathy, and others, in 
terms of profile similarities and dif- 
ferences. A major drawback to for- 
mulating such hypotheses is that they 
inevitably involve the semi-undefina- 
ble quality of “similarity.” Also, the 
general experience has been that the 
results of such studies often are much 
clearer when univariate comparisons 
rather than profile comparisons are 
made. This is illustrated in some of 
the studies of before-after therapy 
comparisons of profiles of "self" and
"ideal-self" ratings. What has generally
been found is that nearly
everyone has the same "ideal" before
therapy, and the ideal changes little
during therapy. The change, if any,
is in the self, and the change is mainly
toward higher self-esteem (Rogers & 
Dymond, 1954, p. 417). Conse- 
quently, rather than assert vague and 
complex hypotheses about similari- 
ties among profiles before and after, 
it is much more meaningful to hy- 
pothesize that successful therapy 
raises self-esteem. Studies of inter- 
personal perceptions (for example, 
Bass & Fiedler, 1959) have also indi- 
cated that univariate comparisons 
often are more revealing than profile 
comparisons. 


SUMMARY 


Methods were suggested for handling
three problems in the analysis
of test profiles: measuring the similarity
of profiles, discriminating the
typical profiles of two or more groups,
and clustering profiles into homogeneous
groups. The suggested methods
were, respectively: picturing profiles
as interpoint distances in Euclidean
space, use of the linear multiple-discriminant
function, and factor
analysis of profile cross-product
terms. Some suggestions were given
about transformations of profile data
before further analysis. Some opinions
were stated about the appropriateness
of profile analysis in studies
of personnel decisions and in investigations
of psychological theories.


ON SIMPLE METHODS OF SCORING TRACKING ERROR


The thesis of this paper is that simple
measures of the error for one-
dimensional tracking, provided the 
right ones are used, can reveal the 
response strategy which the subject 
(S) adopts without involving an inor- 
dinate amount of work scoring rec- 
ords. The measures are the mean 
constant error (CE), and the standard 
deviation (SD) of the error indicating 
within-S variability, computed sep- 
arately for position and time at 
various points on the input. Sam- 
ples of the measures can be obtained 
relatively easily and quickly by hand 
from a record like that in Figure 1. 
When an electronic display is used it 
is possible to produce such a record 
by feeding the input and response 
into two channels of an oscillographic 
recorder, and subsequently super- 
imposing them. 

No special merit is attached to 
measuring by hand. Once it has been 
decided which are the best measures 
to use, it is possible either to build 
electronic devices to do the measuring,
or to record performance on, for
example, magnetic tape, and to feed
the tape into a computer which is 
first programed to produce one meas- 
ure, and then programed to produce 
another (Webber & Adams, 1960). 
However, electronic devices and com- 
puter programs only answer the ques- 
tions which the experimenter (E) 
asks of them; they cannot tell him 
what new questions to ask. Unless E
has available almost unlimited resources
for automatic data processing
so that he can test out quite unlikely
hypotheses, it may be advisable to
make measurements by hand on
sample records in order to be sure of
not missing new and unexpected
features. A simple clinical assessment
from watching S perform or
from serving as S may motivate E
to make the necessary analyses, but
is unlikely to tell him precisely what
are the best measurements to make.


1 The author is indebted to M. Stone for
discussions on the statistical aspects of these
problems, to J. A. Adams and E. J. Archer
for constructive criticisms, and to the British
Medical Research Council for financial support.

The parallel approach using the 
describing functions of the engineer 
can be dismissed in a few words, 
since it is less relevant to psychology, 
and has been ably summarized by 
McRuer and Krendel (1957). De- 
scribing functions are based upon 
mathematical systems of analysis 
designed primarily to determine the 
numerical values of the parameters 
of servomechanisms. They are thus 
capable of giving exact numerical 
values to those aspects of human 
tracking performance which resemble 
the parameters of servomechanisms 
(Ellson, 1959). But they are not so
suited as simple measures are for
determining the details of the ways 
in which human operators do not 
behave like servomechanisms; these 
include most of the phenomena 
studied by psychologists (Adams, 
1961, pp. 56-60). 


SOME SIMPLE METHODS OF SCORING

Overall Error in Position

One of the simplest measures of
performance is the mean error in
position neglecting the sign. During
World War II Craik and Vince
([...]) recorded performance in
[...] tracking on a
[...] drum moving at 50 millimeters
per minute, measured the error
in position every 1.0 millimeter
[...], and took the mean.
This is equivalent to drawing
in Figure 1 a series of vertical lines
connecting the two functions and
computing their mean height. (If a
search is to be made for high frequencies
in the response, it is
necessary to sample at a frequency
at least double that of the
highest component. But when using
this method of scoring there is
little point in sampling the error
more frequently than once per second
if a reasonable length of record
is available. For the only valid statistical
test of whether the results are
representative of the population from
which the Ss are drawn depends upon
the differences between Ss; and there
is a limit beyond which increasing
the reliability of the means of individual
Ss makes little difference to
the variability between Ss).


FIG. 1. Section of a record from pursuit tracking in one dimension with a preview of 2.5
seconds. (R, r, and r': points at which the input, the undershooting response, and a comparable
overshoot represented by the broken line, reversed direction. I1 and I2: points of
inflection on the input. M1 and M2 lie on the input half way in time between R and I1 or I2.
Axes: movement in inches against time in seconds. Shaded areas: included in overall position
error but not in overall time error.)

If it is desired to increase the 
penalty upon large errors, a function 
of squared error such as mean 
squared error or root mean squared 
error may be used instead of mean 
error. If small errors do not matter 
at all, a "target" area of a selected 
size may be used and error be scored. 
only beyond it. This is equivalent to 
increasing the width of the input 
function in Figure 1, and measuring 
the error from its edges. The method 
is somewhat analogous to measuring 
only the time for which S is outside
the designated target area (see time 
on target below). 

A major disadvantage of using only 
a measure of the overall error in posi- 
tion neglecting the sign, is that it 
confounds what are probably true 
errors of position with errors of tim- 
ing. The heights of the shaded areas 
in Figure 1 can be looked upon as representing
true errors in position,
since S should have stopped when he
reached the points at which the input 
reversed direction, but instead 
stopped short of them or went on too 
far. In contrast, over most of the 
distance between reversals the error 
can be looked upon as more in the 
nature of error in timing, since S
covered the right ground, but did so 
either too early or too late. Measures 
of the overall error in position neg- 
lecting its sign include both these 
rather different sources of error. 

A measure which is probably more 
or less uncontaminated by errors in 
timing can be obtained by averaging 
the errors in position taking their 
signs into account. This mean CE 
shows the extent to which the re- 
sponse is on average to one side or 
other of the input. Unfortunately 
it gives no indication of whether S 
tended to overshoot or undershoot at 
reversals, since adjacent overshoots 
tend to cancel out in the side-to-side 
dimension, and the same applies to 
adjacent undershoots. The mean CE 
is not very contaminated by errors in 
timing provided the input is on aver- 
age symmetrical with respect to time 
and position (as harmonic inputs are),
and provided the same is approxi- 
mately true of the response. For 
under these conditions the CEs for 
time tend to cancel out in the posi- 
tion dimension, and vice versa. 
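
The overall position measures discussed in this section can be sketched as follows, assuming that position errors (response minus input) have already been read off the record at regular intervals. The error values are hypothetical, and dividing by the number of samples rather than one less in the SD is a simplifying choice, not something specified in the text.

```python
import math


def error_scores(errors):
    """Summary scores for a series of signed position errors."""
    n = len(errors)
    mean_abs = sum(abs(e) for e in errors) / n       # mean error, sign neglected
    ce = sum(errors) / n                             # mean constant error (CE)
    rms = math.sqrt(sum(e * e for e in errors) / n)  # root mean squared error
    sd = math.sqrt(sum((e - ce) ** 2 for e in errors) / n)
    return {"mean_abs": mean_abs, "ce": ce, "rms": rms, "sd": sd}


# Hypothetical signed errors in inches (illustrative values only).
errors = [0.2, -0.1, 0.3, -0.4, 0.0]
scores = error_scores(errors)
```

In this made-up series the signed errors cancel, so the CE is essentially zero even though the mean error neglecting the sign is 0.2 inch, which illustrates why the two measures answer different questions.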


Overall Error in Time 


During World War II Helson 
(1949, p. 477) used measures of time 
error in one-dimensional tracking as 
well as measures of position error. 
Measuring the error in time gives a 
mean lag if the sign of the error is 
taken into account, in addition to a 
mean error when the sign is neg- 
lected. In pursuit tracking with a 
reasonably random harmonic input 
of high frequency which cannot be
seen in advance, the two measures
tend to give similar values since the 
response is rarely ahead of the input 
under these conditions (Poulton, 
1954, Table 3, fast course). However 
with a random low frequency or sim- 
ple harmonic input the mean lag may 
be relatively small compared with 
the mean error. Again a function of 
squared error can be used instead of 
mean error if it is desired to increase 
the penalty upon large errors. 

Taking the mean error in time 
neglecting the sign is equivalent to 
drawing in Figure 1 a series of hori- 
zontal lines connecting the two func- 
tions, and computing their mean 
length. Integrating the error in time 
with respect to position in this way 
gives the same result as integrating 
the error in position with respect to 
time, except in so far as S overshoots 
or undershoots at the reversals in the 
direction of movement of the input. 
Figure 1 shows that overshoots are 
not taken into account in the time 
dimension, since there is no input to 
which they correspond. Similarly 
undershoots leave a loop of the input 
without a corresponding part of the 
response function. Thus the shaded 
areas in Figure 1, which can be 
looked upon as predominantly error 
in position, are included in the over- 
all error in position, but not in the 
overall error in time. 

Just before an undershot reversal 
such as R in Figure 1, S is typically 
much behind the input, whereas just 
after the reversal he is typically 
much ahead of it. Conversely in 
overshooting, which is represented by
the broken line in Figure 1, S is
typically first ahead of the input and
then behind. The sign of the change
in time error introduced by an undershoot
or overshoot before a reversal
is the opposite of the sign of the
change introduced after the reversal.
Thus if the sign of the time
error is taken into account, as in 
calculating the overall lag, the change 
in the error just before the reversal 
tends to cancel the change just after 
the reversal. Mean lag is thus not 
appreciably affected by overshooting 
or undershooting. This is not the 
case for the mean error in time neg- 
lecting the sign, unless S consistently 
lags further behind the input than the 
sizes of the changes in time error in- 
troduced by overshoots and under- 
shoots. It can never be the case for 
overall measures involving squared 
time errors, since $(L+c)^2+(L-c)^2 \geq 2L^2$;
these measures are necessarily
inflated by overshooting and under- 
shooting, even though they exclude 
the shaded areas in Figure 1. 
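
The inequality used in this argument follows from expanding the two squares. The sketch below reads L as the lag that would be present without overshooting or undershooting and c as the equal and opposite change in time error before and after a reversal; that reading of the symbols is an assumption, since the text does not define them.

```latex
\begin{aligned}
(L+c)^2 + (L-c)^2 &= \left(L^2 + 2Lc + c^2\right) + \left(L^2 - 2Lc + c^2\right) \\
                  &= 2L^2 + 2c^2 \;\geq\; 2L^2 ,
\end{aligned}
\qquad \text{with equality only when } c = 0 .
```

The cross terms cancel, which is why the signed mean lag is unaffected, while the squared terms add, which is why any measure built on squared time error is inflated.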

The failure to take account of the 
shaded areas in Figure 1 is a major 
disadvantage of using the overall 
error in time with sign neglected as 
the sole measure of performance in 
tracking inputs which reverse direc- 
tion. Time on target does not meet 
this particular difficulty. This meas- 
ure corresponds to increasing the 
thickness of the input function in 
Figure 1 to the width of the target 
area, and measuring the time for 
which the response line lies within its
boundaries. However time on target
takes no account of the size of the
excursions from the target area, and
its exact meaning has been questioned
by Bahrick, Fitts, and Briggs (1957)
on these and other grounds.

Cross-correlation is a more sophisticated
technique for determining the
time relationships between
input and response. This involves
correlating the two after moving the
response by each of a number of
steps along the time dimension
([...] Bennett, 1956). The size
of the step by which the response has to be
moved forward or backward along
the time dimension in order to give
the largest correlation with the in- 
put indicates S's overall lag or lead 
with respect to the input. 
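
The shifting-and-correlating procedure can be sketched as follows. The input and response here are synthetic (a sine wave and a copy delayed by three samples), the function names are hypothetical, and a positive shift is defined, as an arbitrary convention for this sketch, to mean that the response lags the input.

```python
import math


def pearson(x, y):
    """Product-moment correlation between two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def best_shift(inp, resp, max_shift):
    """Return (shift, r) for the shift in samples giving the largest r."""
    best = None
    for k in range(-max_shift, max_shift + 1):
        if k >= 0:
            # Compare input at sample t with response at sample t + k.
            x, y = inp[:len(inp) - k], resp[k:]
        else:
            x, y = inp[-k:], resp[:len(resp) + k]
        r = pearson(x, y)
        if best is None or r > best[1]:
            best = (k, r)
    return best


# Synthetic records: the response repeats the input three samples late.
inp = [math.sin(2 * math.pi * t / 40) for t in range(200)]
resp = [math.sin(2 * math.pi * (t - 3) / 40) for t in range(200)]
shift, r = best_shift(inp, resp, 10)
```

With real records the peak is less sharp, and the sampling interval limits how finely the lag or lead can be resolved.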


Average Error at Particular Points on 
the Input 


With harmonic inputs average 
errors can be calculated at particular 
points such as reversals in direction, 
points of inflection, and points half 
way in time between reversals and 
points of inflection. Figure 1 shows 
that between reversals it is often im- 
possible to specify with any degree of 
certainty the corresponding points on 
the wiggly response record; thus the 
mean lag computed as described 
above is probably the most meaningful
measure here. At the points of inflection
(I1 and I2 in Figure 1) the
mean lag is probably more or less un- 
contaminated by errors in position- 
ing, since these points are placed 
symmetrically on the input. However 
at the points half way in time between
reversals and points of inflection
(M1 and M2 in Figure 1) a
tendency to undershoot increases the 
mean lag before reversals and cor- 
respondingly reduces it afterwards. 
A tendency to overshoot typically 
has an opposite though smaller effect. 
To the extent that the effect of over- 
shooting does not fully balance that 
of undershooting, simple variability 
in the overshoot-undershoot dimen- 
sion should have an effect similar to 
undershooting, though less marked. 

A reversal on the input (Point R in
Figure 1) can be compared directly
with the corresponding reversal on
the response function (Point r or
r'). At these points it is thus possible
to obtain four separate measures: a 
CE and an SD of error related to the 
position of the response irrespective 
of its timing, and two similar meas- 
ures related to the timing of the re- 
sponse without regard to its position. 
The sizes of the overshoots and un- 
dershoots (the heights of the shaded 
areas in Figure 1) probably provide 
the most useful measures in the posi- 
tion dimension. An alternative ver- 
sion of the CE in position, which 
shows only the extent to which the 
response is on average too far to one 
side or other of the input, is less rel- 
evant to the primary problem fac- 
ing S. 


An Illustration 


Table 1 shows results from an as 
yet unpublished experiment to illus- 
trate some of the more useful simple 
methods of scoring. Each entry in the 
table is based upon only 120 measure- 
ments, 10 from each of 12 Ss. The 
complete data for the Preview version 
thus required only 840 measure- 
ments, and the same is true of the Slit 
version. This has been done de- 
liberately in order to minimize labor, 
and to show how much can be dis- 
covered in spite of this. More gener- 


E. C. POULTON 


ous sampling calls if possible for 
automatic methods of scoring. 


METHOD 


Apparatus. An irregular curved input, of 
which Figure 1 shows a sample, was drawn on 
a paper tape which moved towards S at a 
rate of 1.0 inch per second. The frequencies 
in the input were 26 cycles per minute, 21 
cycles per minute, and a component of 10.5 
cycles per minute which had twice the 
amplitude of the other two. The maximum 
amplitude of movement of the input was 1.75 
inches. A ball-point pen was used as a 
stylus. The stylus could be moved in a slit ly- 
ing over the paper tape at right angles to it. 
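For readers who wish to reproduce an input of this general kind, the following is a rough sketch only. The three frequencies and the 2:1 amplitude ratio come from the description above; the zero starting phases, the sine form of each component, the sampling step, and the reading of "maximum amplitude of movement" as the full side-to-side range are assumptions made here, since the text does not specify them.

```python
import math

FREQS_CPM = (26.0, 21.0, 10.5)   # component frequencies, cycles per minute
REL_AMPL = (1.0, 1.0, 2.0)       # the 10.5 c.p.m. component has twice the amplitude

def raw_input_position(t):
    """Unscaled lateral position of the input at time t (sec.).
    Zero phase for every component is an assumption of this sketch."""
    return sum(a * math.sin(2 * math.pi * (f / 60.0) * t)
               for f, a in zip(FREQS_CPM, REL_AMPL))

def make_track(duration_s=120.0, dt=0.01, full_range_in=1.75):
    """Sample the input and scale it so that its full side-to-side range
    equals full_range_in inches (assumed meaning of 'maximum amplitude')."""
    n = int(round(duration_s / dt)) + 1
    raw = [raw_input_position(i * dt) for i in range(n)]
    scale = full_range_in / (max(raw) - min(raw))
    return [scale * v for v in raw]

track = make_track()
print("range (in.):", max(track) - min(track))
```

With these three frequencies the composite repeats every 2 minutes (52, 42, and 21 whole cycles respectively), so a 30-second trial never sees the pattern repeat.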

Task. For the data in Table 1 S had to keep 
the stylus on the input by moving it from side 
to side in the slit. In the Preview version he 
could see the input 2.5 seconds (2.5 inches) 
ahead of the stylus, as a walker can normally 
see the footpath ahead. In the Slit version he 
could see the input only in the slit in which 
the stylus moved. The slit had a width of .1 
inch. 

Procedure. Each trial lasted 30 seconds. 
Half the Ss did the Preview version first, and 
half the Slit version. Data was also collected 
when a gap separated the input from the 
stylus, and S had to keep the two aligned, 
but it is not shown in Table 1. Practice was 
deliberately restricted, so that for the results 
in the table each S had performed for alto- 
gether only between 2.0 and 5.0 minutes on 
each version. The amounts of practice on 
each version were counterbalanced between 
Ss. 


TABLE 1 
SOME SIMPLE METHODS OF SCORING ERROR 

                                  Preview version             Slit version 
                                  CE            SD            CE            SD 
Points at which error measured    M      SE     M      SE     M      SE     M      SE 

Error in position (mm.) 
  Overall sample                  0      .13    1.61   .16    .2R    .26    3.57   .31 
  Reversals                       .03L   .15    1.05   .11    .38L   .32    2.4    .25 
                                  .41U   .10                  1.13O  .36 
Error in time (sec.) 
  Overall sample                  .039   .013   .117   .010   .11    .015   .15    .013 
  Reversals                       .064   .011   .101   .008   .11    .016   .12    .008 
  ¼ cycle after reversals         .011   .009   .082   .006   .14    .015   .13    .014 
  Points of inflection            .031   .013   .074   .006   .13    .016   .12    .014 
  ¼ cycle before reversals        .082   .013   .096   .010   .12    .017   .12    .015 

Note. L and R indicate that the response was too far to the left or right, while U and O indicate a tendency to 
undershoot or overshoot. The mean CEs for time were all lags. 
a Overall sample vs. Reversals, p <.05 or better. 
b Preview vs. Slit, p <.05 or better. 
c Reversals vs. Points of inflection, p <.05 or better. 


Subjects. These were 12 young enlisted men 
in the British Royal Navy, none of whom had 
done much tracking. 

Scoring. Each mean in the table is based 
upon 10 measurements from the record of 
each S. The measures at reversals, at points 
of inflection, and at the two series of points 
midway in time between reversals and points 
of inflection came from the second half of the 
30-second trial. At the three latter sets of 
points on the input the time error corresponds 
to the horizontal distance in Figure 1 between 
the two functions. At reversals the time and 
position error are, respectively, the difference 
between when and where the input reversed 
direction and when and where the response did 
so. Where S stopped for an appreciable time 
before moving off again in the opposite direc- 
tion, as near r in Figure 1, the time error is 
computed from the average of the time at 
which he stopped and the time at which he 
moved off again. 
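The scoring rules for reversals can be made concrete with a short sketch. It is an illustration under assumptions rather than the scoring actually used: the record layout (the keys `t_input`, `x_input`, `direction`, `t_stop`, `t_move_off`, `x_response`) and the two example reversals are hypothetical, and the SD is taken as the sample standard deviation.

```python
from statistics import mean, stdev

def score_reversals(reversals):
    """Return {measure: (CE, SD)} for errors at reversals.

    time      : response reversal time minus input reversal time (+ = lag);
                where S paused, the response time is the average of the
                time he stopped and the time he moved off again.
    overshoot : positional error signed along the direction of travel
                (+ = overshoot, - = undershoot).
    side      : positional error signed left/right (+ = right, - = left).
    """
    time_err, overshoot, side_err = [], [], []
    for r in reversals:
        resp_time = (r["t_stop"] + r["t_move_off"]) / 2
        time_err.append(resp_time - r["t_input"])
        offset = r["x_response"] - r["x_input"]
        side_err.append(offset)
        # direction is +1 when the input was travelling rightward into the
        # reversal, -1 when leftward (a convention assumed for this sketch)
        overshoot.append(offset * r["direction"])
    measures = (("time", time_err), ("overshoot", overshoot), ("side", side_err))
    return {name: (mean(v), stdev(v)) for name, v in measures}

# Two hypothetical reversals (times in sec., positions in mm.)
reversals = [
    {"t_input": 16.0, "x_input": 20.0, "direction": +1,
     "t_stop": 16.04, "t_move_off": 16.10, "x_response": 19.0},
    {"t_input": 17.2, "x_input": -15.0, "direction": -1,
     "t_stop": 17.25, "t_move_off": 17.25, "x_response": -16.5},
]
scores = score_reversals(reversals)
print(scores)
```

In this made-up pair the response is to the left of the input both times, yet one error is an undershoot and the other an overshoot. This is why the overshoot-undershoot score and the left-right score are kept as separate measures.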

The measures in the overall sample rep- 
resent performance averaged over all points on 
the input. The sample is based upon 10 points 
separated by 1.6 seconds, also from the second 
half of the trial. (This periodicity is not re- 
lated in a simple way to any of the three fre- 
quency components of the input, and makes 
the sample cover approximately the same 
length of record as the samples at particular 
points on the input.) Where an error in time 
could not be measured at the selected point be- 
cause S had undershot, the next point was 
taken instead. 
Statistics. The mean SDs show the 
variability within Ss, the SEs the variability 
between Ss. Thus the SE of an SD indicates 
the size of the individual differences in vari- 
ability. The reliability of the differences be- 
tween means was assessed by two-sample t 
tests with 11 degrees of freedom, using two 
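The distinction between within-S and between-S variability can be sketched as follows. This is a hedged illustration, not the author's computation: it assumes that the 11 degrees of freedom arise from the 12 within-S differences between conditions, it uses the sample (n - 1) standard deviation, and the per-S values shown are invented for the example.

```python
import math
from statistics import mean, stdev

def subject_scores(errors):
    """CE (mean) and SD of one S's errors, i.e. variability within that S."""
    return mean(errors), stdev(errors)

def mean_and_se(values):
    """Mean across Ss and its standard error, i.e. variability between Ss."""
    return mean(values), stdev(values) / math.sqrt(len(values))

def paired_t(cond_a, cond_b):
    """t for the mean within-S difference; df = number of Ss minus 1."""
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    m, se = mean_and_se(diffs)
    return m / se, len(diffs) - 1

# Hypothetical SDs of position error (mm.) for 12 Ss in two conditions
slit_sd = [2.1, 2.9, 2.4, 3.3, 1.8, 2.6, 2.2, 3.0, 2.7, 1.9, 2.5, 2.3]
preview_sd = [1.0, 1.4, 0.9, 1.5, 0.8, 1.1, 1.2, 1.3, 1.0, 0.7, 1.1, 0.9]

m, se = mean_and_se(preview_sd)   # the SE of an SD: individual differences
t, df = paired_t(slit_sd, preview_sd)
print(f"mean SD = {m:.2f}, SE = {se:.2f}, t({df}) = {t:.2f}")
```

The resulting t would then be referred to a table of t with the stated degrees of freedom.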


RESULTS AND DISCUSSION 


It has been suggested above that 
performance at reversals probably 
provides the most useful measures of 
errors in position. Table 1 shows 
that in the Preview version S tended 
to undershoot by an average of .41 
millimeters (p<.002), whereas in the 
Slit version he tended to overshoot by 
an average of 1.13 millimeters 
(p<.01). The SDs of the errors in 
position at reversals were over twice 
as large in the Slit version as with 
Preview (p<.001). In the overall 
sample, which shows performance 
averaged over all points on the input, 
the mean CE in position is unlikely to 
be very contaminated by time error, 
but the SD of the error in position 
necessarily contains an unknown 
component of time error. There was 
no significant difference between the 
overall sample and reversals in the 
extent to which the response was on 
average to one side or other of the 
input (p>.05), but as expected the 
SDs of the overall sample were sig- 
nificantly too large in both versions 
as compared with the SDs at re- 
versals (p <.01). 

As already indicated, the mean 
CEs in time at reversals are not con- 
taminated by errors in positioning, 
and the mean CEs of the overall sam- 
ple and at the points of inflection are 
unlikely to be very contaminated. 
The combined means of the CEs ¼ 
cycle before and after reversals are 
also unlikely to be very contami- 
nated, but a tendency to undershoot, 
and even simple variability in posi- 
tioning at reversals, is likely to in- 
crease the mean time lags before re- 
versals, and to reduce correspond- 
ingly the mean lags after reversals. 
Table 1 shows that in the Preview 
version the mean time lag was twice 
as great at reversals as it was at the 
points of inflection on the input 
(p<.05). The average time lag at the 
intermediate points on the input, ¼ 
cycle before and after reversals, .047 
seconds, lay intermediately. Thus in 
the Preview version S did not simply 
reproduce the input as accurately 
as he could with a constant time lag: 
his timing varied significantly at 
different points on the input cycle. 
The nature of this nonlinear relation- 
ship would not have been so easy to 
determine using the describing func- 
tions of the engineer, since servo- 
mechanisms do not normally behave 
like this (see the introduction). 

In the Slit version there were no 
significant differences between the 
mean time lags at different points on 
the input (p>.05). If differences 
exist, larger samples of data are re- 
quired in order to reveal them. None 
of the SDs of the error in time is an 
adequate measure of the overall 
variability in the time dimension. 
The SDs at reversals are likely to be 
smaller than the true overall vari- 
ability in timing, since only one 
point on the input is represented. 
However, unlike the remaining SDs 
in time in Table 1, the SDs at re- 
versals are not contaminated by 
variability in the position dimension; 
thus it is possible to make a valid 
comparison of the variability in tim- 
ing at these points between the two 
versions. Table 1 shows that there 
was significantly more variability in 
timing at reversals in the Slit version 
(p<.01), although the size of the 
difference was not large. 


Two Different Strategies 


The effect of size of preview was 
investigated in a previous experiment 
using the same input and apparatus 
as used here. Overall performance 
was found to change markedly when 
the preview was increased from zero 
(as in the Slit version) to .4 second; 
but there was no significant further 
change when the preview was ex- 
tended from .4 to 8.0 seconds (Poul- 
ton, 1954, p. 406). The 2.5-second 
preview used here was chosen to be 
well on that part of the function 

where overall performance had ceased 
to change appreciably with increase 
in preview. The differences between 
the Slit and Preview versions in Table 
1 can thus be taken to represent the 
maximum effect which a change in 
preview is likely to produce. By com- 
paring the two versions it is possible 
to indicate the response strategy 
adopted in each case, and thus to 
demonstrate the usefulness of the 
simple measures given in the table. 

In the Preview version Table 1 
shows that the mean SD of the error 
in position at reversals was less than 
half the size of that in the Slit ver- 
sion. In addition the mean lag was 
twice as large at reversals as at the 
points of inflection on the input, 
whereas there was little difference in 
the Slit version. Thus in attempting 
to keep as much as possible on the 
input, S adopted the strategy in the 
Preview version of minimizing over- 
shoots and undershoots at reversals 
by approaching them more slowly 
than did the input itself, and catching 
up again at the start of the return 
movement. 

The stimulus conditions which 
presumably determined this strategy 
were as follows: close to the points of 
inflection on the input a small error 
in timing produced a considerable 
misalignment, since here the input 
was moving at its maximum velocity 
(see Figure 1). In contrast, close to 
reversals even a relatively large error 
in timing did not produce much 
misalignment provided the ampli- 
tude of the response was correct, since 
the input was more or less stationary; 
whereas an overshoot or undershoot 
not only produced a misalignment 
proportional to its size, but the mis- 
alignment tended to remain for quite 
a time, since both input and response 
moved so slowly here (see Figure 1). 
Misalignment was thus minimized by 
concentrating upon correct timing 
near the points of inflection, and 
upon correct positioning near re- 
versals. 

In the Slit version Table 1 shows 
that the SD of the error in position 
at reversals was over twice as large 
as in the Preview version. Also S 
overshot by an average of 1.13 milli- 
meters, as compared with a mean 
undershoot of only about one third 
the size in the Preview version. In 
the time dimension the mean lag in 
the overall sample was almost three 
times as large in the Slit version as 
with Preview, although it was by no 
means as long as a visual reaction 
time (RT) which is usually given as 
about .18 second (Woodworth, 1938, 
p. 324). In addition, at least at re- 
versals, the timing was rather more 
variable in the Slit version, although 
there was less difference than in the 
Preview version between the mean 
lag at one point on the input and 
another. 

The Slit version presented what 
was effectively an insoluble problem: 
S had to keep up with an irregular in- 
put which he could not see in ad- 
vance. In so far as he attempted to 
compensate for his RT he had there- 
fore to act on his predictions as to 
what the input was about to do, and 
thus to risk overshooting when the 
input stopped and reversed direction 
unexpectedly, and  undershooting 
when the input went on further than 
he expected. In a typical RT experi- 
ment his behavior would produce so- 
called "premature" or "false" re- 
actions. Faced with this problem, S 
adopted a strategy which was a com- 
promise between on the one hand 
keeping up with the input regardless 
of overshooting and undershooting 
at reversals, and on the other hand of 
minimizing overshoots and under- 
shoots by following a full RT behind 
the input. 


SUMMARY 


For one-dimensional tracking sim- 
ple measures are described which can 
be made in terms of both position 
and time. The measures may be 
averaged over all points on the in- 
put, or may be averaged for only one 
kind of point; e.g., at the reversals in 
direction, or at the points of inflection 
on the input. At reversals it is 
possible to score the error between 
the corresponding points on the in- 
put and response functions, and thus 
to produce measures of error in posi- 
tioning which are uncontaminated by 
errors in timing and vice versa. 

Overshoots and undershoots are 
probably the most relevant errors of 
positioning. The extent to which the 
response is on average to one side or 
other of the input, the overall lag or 
lead, and the mean lag or lead at the 
points of inflection on the input and 
at points situated symmetrically on 
either side of it, can all probably be 
computed in a reasonably uncon- 
taminated form. Most other meas- 
ures described confound more seri- 
ously errors of positioning with errors 
of timing. 

Some unpublished data are used to 
illustrate the various measures. They 
show a previously unreported rela- 
tionship which would not have been 
so easy to specify using the describ- 
ing functions of the engineer. From 
the data it is possible to distinguish 
two different response strategies, 
which can be related to differences in 
stimulus conditions. The analysis 
demonstrates the increased insight 
into the stimuli influencing S in 
tracking, and into the strategies 
adopted, which can come from simple 
methods of scoring involving only a 
limited number of measurements. 


The heterogeneity of schizophrenic 
patients and the lack of success in 
relating variable schizophrenic func- 
tioning to diagnostic subtypes (King, 
1954) have indicated the serious 
limitations of the current neuro- 
psychiatric classification of schizo- 
phrenia. In response to these 
limitations interest has arisen in a 
two-dimensional frame of reference 
for schizophrenia. Such a conception 
is based on the patient's life history 
and/or prognosis. A number of 
terms—malignant-benign, dementia 
praecox-schizophrenia, chronic-ep- 
isodic, chronic-acute, typical-atyp- 
ical, evolutionary-reactive, true- 
schizophreniform, process-reactive— 
have appeared in the literature de- 
scribing these two syndromes. Proc- 
ess schizophrenia involves a long-term 
progressive deterioration of the ad- 
justment pattern with little chance 
of recovery, while reactive schizo- 
phrenia indicates a good prognosis 
based on a history of generally ade- 
quate social development with nota- 
ble stress precipitating the psychosis. 
In view of the current favorable 
interest in this approach to the un- 
derstanding of schizophrenia (Rabin 
& King, 1958) the present investiga- 
tion was designed as an evaluative re- 
view of the literature on the process- 
reactive classification. 


EARLY PROGNOSTIC STUDIES 


The process-reactive distinction 
had its implicit origin in the work of 
Bleuler (1911). Prior to this the 
Kraepelinian influence had prevailed, 
with dementia praecox considered an 
incurable deteriorative disorder. 
Bleuler, while adhering to an organic 
etiology for schizophrenia, nonethe- 
less observed that some cases re- 
covered. This conclusion opened the 
field to a series of subsequent prog- 
nostic studies (Benjamin, 1946; 
Chase & Silverman, 1943; Hunt & 
Appel, 1936; Kant, 1940, 1941, 1944; 
Kretschmer, 1925; Langfeldt, 1951; 
Lewis, 1936, 1944; Malamud & 
Render, 1939; Mauz, 1930; Milici, 
1939; Paskind & Brown, 1940; Witt- 
man, 1941, 1944; Wittman & Stein- 
berg, 1944a, 1944b) eventuating in 
formalized descriptions of the process 
and reactive syndromes in terms of 
specific criteria. 

These early studies can be classified 
in three general categories: studies 
correlating the outcome of a specific 
type of therapy with certain prog- 
nostic variables, studies descriptively 
evaluating prognostic criteria, and 
studies validating a prognostic scale. 

The first category is illustrated by 
the attempt of Chase and Silverman 
(1943) to correlate the results of 
Metrazol and insulin shock therapy 
with prognosis, using 100 schizo- 
phrenic patients treated with Metra- 
zol and 40 schizophrenic patients 
treated with insulin shock. 

In the first part of this study the 
probable outcome of each of the 150 
patients was estimated on the basis 
of prognostic criteria. The criteria 
considered of primary importance for 
a favorable prognosis were: short 
duration of illness, acute onset, ob- 
vious exogenic precipitating factors, 
early prominence of confusion, and 
atypical symptoms (marked by 
strong mixtures of manic-depressive, 
psychogenic, and symptomatic 
trends), and minimal process symp- 
toms (absence of depersonalization, 
derealization, massive primary per- 
secutory ideas, and sensations of in- 
fluence, conscious realization of per- 
sonality disintegration, bizarre delu- 
sions and hallucinations, marked 
apathy, and dissociation of affect). 
When these conditions were reversed 
the prognosis was least favorable. 
The following factors were considered 
less important for a favorable prog- 
nosis: history of previous illness, 
pyknic body type, extrovert tempera- 
ment and adequate prepsychotic life 
adjustment, catatonic and atypical 
subtypes. Asthenic body type, intro- 
version, inadequacy of prepsychotic 
reactions to life situations, onset of 
illness after the age of 40, and 
hebephrenic and paranoid subtypes 
were considered indicative of un- 
favorable prognosis. Age of onset 
under 40, sex, education, and abil- 
ities, and hereditary background were 
not considered of prognostic impor- 
tance. An analysis of the prognosti- 
cally significant factors resulted in 
the evaluation of the prognosis for 
each case as good, fair, or poor. 
Following termination of shock 
treatment all patients were followed 
up for an average of 10 months and 
divided into three groups: much im- 
proved, improved, and unimproved. 
A comparison of the prognostic 
assessments with the results of shock 
indicated that of 43 cases in which 
the prognosis was considered good, 
33 showed remissions, while of 74 
cases with a poor prognosis, 63 did 
not improve. It was concluded that 
shock therapies were effective in 
cases of schizophrenia in which the 
prognosis was favorable, but were of 
little value when the prognosis was 
poor. 
The second part of the research in- 
volved a reanalysis of the prognostic 


WILLIAM G. HERRON 


criteria in the light of the results of 
shock treatment. Short duration of 
illness and the absence of process 
symptoms were the most significant 
factors for favorable outcome, while 
long duration of illness (more than 2 
years) and the presence of process 
symptoms were primary in determin- 
ing poor prognosis. 
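The agreement between prognosis and outcome reported above is easily expressed as proportions. The short sketch below only restates the counts given in the text (33 of 43 good-prognosis cases remitting, 63 of 74 poor-prognosis cases failing to improve). The helper name `rate` is introduced here, and the cases with a fair prognosis are left out because their outcomes are not reported.

```python
def rate(count, total):
    """Proportion of cases in which outcome matched the prognosis."""
    return count / total

good_remitted = rate(33, 43)       # good prognosis, showed remission
poor_unimproved = rate(63, 74)     # poor prognosis, did not improve
combined = rate(33 + 63, 43 + 74)  # both extreme groups taken together

print(f"{good_remitted:.1%} {poor_unimproved:.1%} {combined:.1%}")
# prints "76.7% 85.1% 82.1%"
```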

A descriptive review of prognostic 
factors is seen in Kant's (1944) de- 
scription of the benign (reactive) 
syndrome as cases in which clouding 
and confusion prevail, or in which 
the schizophrenic symptoms center 
around manic-depressive features, 
or cases with alternating states of 
excitement and stupor with frag- 
mentation of mental activity. Malig- 
nant (process) cases are characterized 
by direct process symptoms. These 
include changes in the behavior lead- 
ing to disorganization, dulling and 
autism, preceding the outbreak of 
overt psychosis. The most subtle 
manifestation of this is the typical 
schizophrenic thought disturbance. 
The patient experiences the process as 
a loss of normal feeling of personal- 
ity activity and the start of exper- 
iencing a foreign influence applied to 
mind or body. 

The third category includes the 
Elgin Prognostic Scale, constructed 
by Wittman (1941) to predict re- 
covery in schizophrenia. It is com- 
prised of 20 rating scales weighted 
according to prognostic importance: 
favorable factors are weighted nega- 
tively, and unfavorable factors are 
assigned positive weights. Initial 
validation involved 343 schizophrenic 
cases placed on shock treatment. 
Wittman and Steinberg (1944a) per- 
formed a follow-up study on 
schizophrenics and 156 manic-de- 
pressive patients. The Elgin scale 
proved effective in predicting the 
outcome of therapy in 80-85% of the 
cases in both studies, and has been 
utilized in the work of Becker (1956, 
1959), King (1958), and McDonough 
(1960) to distinguish the process- 
reactive syndrome. Included in the 
subscales of the Elgin scale are 
evaluations of prepsychotic personal- 
ity, nature of onset, and typicality 
of the psychosis relative to Krae- 
pelin's definition. 


STUDIES WITH DETAILED PROCESS- 
REACTIVE CRITERIA 


The synthesis of early studies is 
found in the research of Kantor, 
Wallner, and Winder (1953) establish- 
ing detailed criteria for distinguishing 
the two syndromes on the basis of 
case history material. A process 
patient would exhibit the following 
characteristics: early psychological 
trauma, severe or long physical ill- 
ness, odd member of the family, 
school difficulties, family troubles 
paralleled by sudden changes in the 
patient's behavior, introverted be- 
havior trends and interests, history 
of a breakdown of social, physical, 
and/or mental functioning, patho- 
logical siblings, overprotective or re- 
jecting mother, rejecting father, lack 
of heterosexuality, insidious gradual 
onset of psychosis without pertinent 
stress, physical aggression, poor re- 
sponse to treatment, lengthy stay in 
the hospital, massive paranoia, little 
capacity for alcohol, no manic-de- 
pressive component, failure under 
adversity, discrepancy between abil- 
ity and achievement, awareness of a 
change in the self, somatic delusions, 
a clash between the culture and the 
environment, and a loss of decency. 

In contrast, the reactive patient has 
these characteristics: good psycho- 
logical history, good physical health, 
normal family member, well adjusted 
at school, domestic troubles unaccom- 
panied by behavioral disruptions in 
the patient, extroverted behavior 
trends and interests, history of ade- 
quate social, physical, and/or mental 
functioning, normal siblings, nor- 
mally protective accepting mother, 
accepting father, heterosexual be- 
havior, sudden onset of psychosis 
with pertinent stress present, verbal 
aggression, good response to treat- 
ment, short stay in the hospital, 
minor paranoid trends, good capacity 
for alcohol, manic-depressive com- 
ponent present, success despite ad- 
versity, harmony between ability and 
achievement, no sensation of self- 
change, absence of somatic delusions, 
harmony between the culture and the 
environment, and retention of de- 
cency. 

The first three criteria apply to the 
patient's behavior between birth and 
the fifth year; the next seven, be- 
tween the fifth year and adolescence; 
the next five, from adolescence to 
adulthood; the last nine, during 
adulthood. Using these 24 points to 
distinguish the two syndromes they 
tried to answer three questions: 

1. Do diagnoses based upon the 
Rorschach alone label as nonpsychot- 
ic a portion of the population of 
mental patients who are clinically 
diagnosed as schizophrenic? 

2. Can case histories of clinically 
diagnosed schizophrenics be differ- 
entiated into two categories: process 
and reactive? 

3. Are those cases rated psychotic 
from the Rorschach classed as process 
on the basis of case histories, and are 
those cases judged nonpsychotic from 
the Rorschach classified as reactive 
from the case histories? 

Two samples of 108 and 95 patients 
clinically diagnosed as schizophrenic 
were given the Rorschach and rated 
according to the process-reactive 
criteria. In the first sample of 108 
patients, 57 were classified as psychot- 
ic and 51 nonpsychotic on the basis 
of the Rorschach alone, while in the 
second sample, of 74 patients who 
could be rated as process or reactive, 
36 were classified as psychotic, and 
38 as nonpsychotic from their Ror- 
schach protocols. Those patients who 
were rated as reactive from their 
history were most often judged non- 
psychotic from the Rorschach, and 
those rated process from the case 
histories were most often judged as 
psychotic from the Rorschach. 
Only one judge was used in the 
second sample to rate the patients as 
process or reactive, but two judges 
were used in the first sample. Of the 
108 patients in this sample, both 
judges rated 86 cases, and were in 
agreement on 64 of these, which is 
greater than would be expected by 
chance. 
However, the accuracy of the 
schizophrenic diagnosis is question- 
able in this study. If the Rorschach 
diagnosis is followed, then it appears 
that reactive schizophrenics are not 
psychotic. Furthermore, the psychiat- 
ric diagnosis appears to be somewhat 
contaminated because it was estab- 
lished on the basis of data collected 
by all appropriate services of the 
hospital, including psychological ex- 
aminations. A similar type of con- 
tamination may have been present in 
classifying patients as process or re- 
active because one judge had re- 
viewed each case previously and had 
seen psychological examination and 
history materials together prior to 
making his ratings. Three difficulties 
can be found with the criteria for 
process-reactive ratings. First, case 
histories are often incomplete and 
the patient is unable or unwilling to 
supply the necessary information. 
Second, it is difficult to precisely 
apply some of the criteria. For ex- 
ample, what is the precise dividing 
line between oddity and normality 
within the family? Third, in order to 
classify a patient it is necessary to set 
an arbitrary cut off point based on 
the number of process or reactive 
characteristics a patient has. Such a 
procedure needs validation. 
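Two of the points above lend themselves to a brief numerical sketch: the agreement between the two judges (64 of 86 cases) and the dependence of classification on an arbitrary cut-off. Everything beyond the counts 64, 86, and 24 is an assumption made for illustration. The chance level of .5 per case, the use of an exact binomial tail, the function names, the rule "process if the count of process characteristics reaches the cut-off," and the example patient are hypothetical and are not taken from Kantor et al.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """Exact probability of k or more successes in n trials."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

agreement = 64 / 86                            # proportion of joint ratings agreeing
p_chance = binomial_upper_tail(64, 86, 0.5)    # assumed chance level of .5

def classify(ratings, cutoff):
    """ratings: one entry per criterion, 'P' (process), 'R' (reactive),
    or None where the case history gives no information.
    Returns (label, number of process items, number of items rated)."""
    rated = [r for r in ratings if r is not None]
    n_process = rated.count("P")
    label = "process" if n_process >= cutoff else "reactive"
    return label, n_process, len(rated)

# Hypothetical patient: 11 process items, 10 reactive items, 3 unratable
patient = ["P"] * 11 + ["R"] * 10 + [None] * 3

print(f"agreement = {agreement:.1%}, chance probability = {p_chance:.2g}")
print(classify(patient, cutoff=12))
print(classify(patient, cutoff=11))
```

The same hypothetical patient is called reactive with a cut-off of 12 and process with a cut-off of 11, which illustrates why the cut-off itself needs validation and why incomplete histories matter.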

Nonetheless, the results of this 
study support the view that schiz- 
ophrenics can be classified as 
process or reactive, and that these 
syndromes differ in psychological 
functioning. 

Another rating scale which has 
been used extensively to distinguish 
prognostically favorable and prog- 
nostically unfavorable schizophrenics 
was developed by Phillips (1953). 
The scale was developed from the 
case histories of schizophrenic pa- 
tients who were eventually given 
shock treatment. The scale evaluates 
each patient in three areas: premorbid 
history, possible precipitating factors, 
and signs of the disorder. Premorbid 
history includes seven items on the 
social aspects of sexual life during 
adolescence and immediately beyond, 
seven items on the social aspects of 
recent sexual life, six items on per- 
sonal relations, and six items on re- 
cent premorbid adjustment in per- 
sonal relations. The sections of the 
scale which reflect the recent sexual 
life and its social history are the most 
successful in predicting the outcome 
of treatment. The items in the scales 
are arranged in order of increasing 
significance for improvement and 
nonimprovement away from the score 
of three, which is the dividing point 
between improved and unimproved 
groups. The premorbid history sub- 
scale has been utilized as the ranking 
instrument in the studies described 
by Rodnick and Garmezy (1957; 
Garmezy & Rodnick, 1959). 

Another approach to the separa- 
tion of schizophrenics into prognostic 
groups uses the activity of the auto- 
nomic nervous system as the basis for 
division (Meadow & Funkenstein, 
1952; Meadow, Greenblatt, Funken- 
stein, & Solomon, 1953; Meadow, 
Greenblatt, & Solomon, 1953). 
Meadow and Funkenstein (1952) 
worked with 58 schizophrenic pa- 
tients tested for autonomic reactiv- 
ity and for abstract thinking. Fol- 
lowing therapy the patients were 
divided into two groups, good or poor, 
depending on the outcome of the 
treatment. The battery of psycho- 
logical tests included the similarities 
and block design subtests of the 
Wechsler-Bellevue scale, the Benja- 
min Proverbs test, and the object 
sorting tests. The physiological test 
involved the systolic blood pressure 
reaction to adrenergic stimulation 
(intravenous Epinephrine) and cho- 
linergic stimulation (intramuscular 
Mecholyl). On the basis of the physi- 
ological and psychological testing, 
schizophrenic cases were divided into 
three types: Type I, characterized by 
marked response to Epinephrine, low 
blood pressure, and failure of the 
blood pressure to rise under most 
stresses, loss of ability for abstract 
thinking, inappropriate affect, and a 
poor prognosis; Type II, character- 
ized by an entirely different auto- 
nomic pattern, relatively intact ab- 
stract ability, anxiety or depression, 
and a good prognosis; Type III, 
showing no autonomic disturbance, 
relatively little loss of abstract abil- 
ity, little anxiety, well organized 
paranoid delusions, and a fair prog- 
nosis. 

However, as Meadow and Funken- 
stein (1952) point out, there is con- 
siderable overlap of the measures 
defining these types so that the classi- 
fication must be tentative. Also, of 
the psychological tests used, only 
Proverbs distinguished significantly 
among the patients when they were 
classified according to autonomic 
reactivity, while Block Design failed 
to distinguish significantly among 
any of the types. Further research us- 
ing this method of division (Meadow, 
Greenblatt, Funkenstein, & Solomon, 
1953; Meadow, Greenblatt, & Solo- 
mon, 1953) served as a basis for in- 
vestigations of the process-reactive 
syndromes by King (1958) and 
Zuckerman and Grosz (1959). 

King (1958) hypothesized that 
predominantly reactive schizophren- 
ics would exhibit a higher level of 
autonomic responsiveness after the 
injection of Mecholyl than predomi- 
nantly process schizophrenics. The 
subjects were 60 schizophrenics who 
were classified as either process or 
reactive by the present investigator 
and an independent judge using the 
criteria of Kantor et al. (1953). Only 
those subjects were used on which 
there was classificatory agreement. 
This resulted in 22 process and 24 
reactive patients. In order to con- 
sider the process-reactive syndrome 
as a continuum, 16 subjects were 
randomly selected from these two 
groups and were ranked by two inde- 
pendent raters. 

While the patient was lying in bed 
shortly after awaking in the morning 
the resting systolic blood pressure was 
determined. The patient then re- 
ceived 10 milligrams of Mecholyl 
intramuscularly, and the systolic 
blood pressure was recorded at in- 
tervals up to 20 minutes. Then the 
maximum fall in systolic blood pres- 
sure (MFBP) below the resting blood 
pressure following the injection of 
Mecholyl was computed for the 
different time intervals. There was a 
significant difference in the MFBP 
score for the reactives as compared 
with the normals. For the 16 sub- 
jects, the correlation between the sets 
of ranks on the process-reactive di- 
mension and MFBP was -.58. 
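
The two quantities in this part of King's (1958) study can be illustrated with a short sketch. It assumes that MFBP is the resting systolic pressure minus the lowest reading after injection, and that the correlation between the two sets of ranks was a Spearman rank-order coefficient; the review states neither detail. The function names and all readings below are hypothetical.

```python
def max_fall(resting, readings):
    """Maximum fall in systolic pressure (mm Hg) below the resting level.

    Assumption: a patient whose pressure never drops below rest scores 0.
    """
    return max(0, resting - min(readings))


def spearman_rho(ranks_a, ranks_b):
    """Spearman rank-order correlation for two untied rankings."""
    n = len(ranks_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical readings taken at intervals after the injection.
fall = max_fall(resting=120, readings=[112, 98, 104, 115])

# Hypothetical ranks: two perfectly opposed orderings of four patients.
rho = spearman_rho([1, 2, 3, 4], [4, 3, 2, 1])
```

The sign of such a coefficient depends on which end of the process-reactive dimension receives rank 1, a convention the review does not state.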

In a second part of the study 90 
schizophrenics, none of whom had 
participated in the first part, were 
classified as either process, process- 
reactive, or reactive, using the cri- 
teria of Kantor et al. (1953). On this 
basis the subjects were divided into 
three groups of 24. Also, scores for 22 
subjects were obtained on the Elgin 
Prognostic Scale, and 12 of these were 
rated independently by two raters. 
The MFBP scores were 17.04 for the 
process group, 22.79 for the process- 
reactive group, and 26.62 for the 
reactive. An analysis of vari- 
ance yielded an F ratio significant at 
the .01 level. The correlation between 
the Elgin Prognostic Scale and the 
MFBP scores for 22 patients was 
-.49. 

Results of both parts of the study 
revealed that the patients classified 
as reactive exhibited a significantly 
greater fall in blood pressure after the 
administration of Mecholyl than the 
process patients. This evidence 
points to diminished physiological 
responsiveness in process, but not in 
reactive schizophrenia. However, 
Zuckerman and Grosz (1959) found 
that process schizophrenics showed a 
significantly greater fall in blood pres- 
sure following the administration of 
Mecholyl than reactives. Since these 
results contradict King's findings the 
question of the direction of respon- 
siveness to Mecholyl in these two 
groups requires further investigation 
before a conclusion can be reached. 


PROCESS-ORGANIC VERSUS 
REACTIVE-PSYCHOGENIC 


Brackbill and Fine (1956) sug- 
gested that process schizophrenics 
suffer from an organic impairment 
not present in the reactive case. They 
hypothesized that there would be no 
significant differences in the inci- 
dence of "organic signs" on the 
Rorschach between a group of proc- 
ess schizophrenics and a group of 
known cases of central nervous sys- 
tem pathology, and that both organic 
and process groups would show sig- 
nificantly more signs of organic in- 
volvement than the reactive group. 

The subjects consisted of 36 pa- 
tients diagnosed as process schizo- 
phrenics and 24 reactive schizophren- 
ics. The criteria of Kantor et al. 
(1953) were used to describe the pa- 
tients as process or reactive. Pa- 
tients were included only when there 
was complete agreement between 
judges as to the category of schizo- 
phrenia. Also included in the sample 
were 28 cases of known organic in- 
volvement. All patients were given 
the Rorschach, and the protocols were 
scored using Piotrowski's (1940) 10 
signs of organicity. 

Using the criterion of five or more 
signs as a definite indication of or- 
ganic involvement there was no sig- 
nificant difference between the or- 
ganic and process groups, but both 
groups were significantly different 
from the reactives. Considering 
individual signs, four distinguished 
between the reactive and organic 
group, while two distinguished be- 
tween process and reactive groups. 
The authors concluded that the re- 
sults supported the hypothesis that 
process schizophrenics react to a 
perceptual task in a similar manner 
to that of patients with central ner- 
vous system pathology. No specific 
hypothesis was made about individ- 
ual Rorschach signs, but color nam- 
ing, completely absent in the reac- 
tives, was indicated as an example of 
concrete thinking and inability to 
abstract, suggesting that one of the 
critical differences between process 
and reactive groups is in terms of a 
type of thought disturbance. 

This study does not provide de- 
tailed information about the manner 
of establishing the diagnosis of schizo- 
phrenia or about the judges deciding 
the process and reactive syndromes. 
Also, a further difficulty is the ad- 
mitted inadequacy of the organic 
signs, since 66% of cases with organic 
pathology in this study were false 
negatives according to the Rorschach 
criteria. Thus while the existence of 
the process and reactive syndromes is 
supported by the results of this in- 
vestigation, there is less evidence of 
an organic deficit in process schizo- 
phrenics. 

Becker (1956) pointed out that the 
consistency of the prognostic findings 
in schizophrenia has led to postulat- 
ing two kinds of schizophrenia: proc- 
ess, with an organic basis, and reac- 
tive, with a psychological basis. He 
rejects this conclusion because re- 
search data in this area show con- 
siderable group overlap, making it 
clinically difficult and arbitrary to 
force all schizophrenics into one 
group or the other. Also, if schizo- 
phrenia is a deficit reaction which 
may be brought about by any com- 
bination of 40 or more etiological 
factors, then the conception of two 
dichotomous types of schizophrenia 
is not useful. Finally, he maintains 
that 20 years of research have failed 
to find clear etiological differences 
between any subgroupings. 

Instead, Becker stated that process 
and reactive syndromes should be 
conceived as end points on a con- 
tinuum of levels of personality or- 
ganization. Process reflects a very 
primitive undifferentiated personality 
structure, while reactive indicates a 
more highly organized one. He hy- 
pothesized that schizophrenics more 
nearly approximating the process 
syndrome would show more regres- 
sive and immature thinking proc- 
esses than schizophrenics who more 
nearly approximate the reactive syn- 
drome. His sample consisted of 51 
schizophrenics, 24 males and 27 fe- 
males, all under 41 years of age. 
Thinking processes were evalu- 
ated by the Rorschach and the 
Benjamin Proverbs test. The 1937 
Stanford-Binet vocabulary test was 
used to estimate verbal intelligence. 
A Rorschach scoring system was used 
which presumably reflected the sub- 
jects’ level of perceptual develop- 
ment, while a scoring system was 
devised for the Proverbs which re- 
flected levels of abstraction. Since 
there is a high relationship between 
intelligence and ability to interpret 
proverbs, a more sensitive index of a 
thinking disturbance was considered 
to be a discrepancy score based on 
the standard score difference between 
a vocabulary estimate of verbal in- 
telligence and the proverbs score. 
Process and reactive ratings were 
made on the Elgin Prognostic Scale. 
The Rorschach mean perceptual 
level score and the Elgin Prognostic 
Scale correlated -.599 for men and 
-.679 for women, indicating a sig- 
nificant relationship between the 
process-reactive dimension as evalu- 
ated from case history data and 
disturbances of thought processes as 
measured by the Rorschach scoring 
system. The proverbs-vocabulary 
discrepancy score was significantly 
related to the process-reactive dimen- 
sion for men, but not for women. No 
adequate explanation was found for 
this sex difference, which qualifies 
the results. A further difficulty oc- 
curs because the case history and test 
evaluations were made by the same 
person. However, the results in part 
support the hypothesis, indicating 
evidence for a measurable dimension 
of regressive and immature thinking 
related to the process-reactive di- 
mension. 

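
The proverbs-vocabulary discrepancy score that Becker (1956) used can be sketched as a difference of standard scores. The sketch assumes z scores computed within the sample and a direction of vocabulary minus proverbs, so that a large positive value means proverb interpretation falls short of what vocabulary would predict. The review specifies neither choice, and the function names and scores are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standard scores relative to the sample mean and standard deviation."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]


def discrepancy_scores(vocabulary, proverbs):
    """Vocabulary z score minus proverbs z score for each subject.

    Assumption: the subtraction runs in this direction.
    """
    return [zv - zp for zv, zp in zip(z_scores(vocabulary), z_scores(proverbs))]


# Hypothetical raw scores for three subjects.
scores = discrepancy_scores(vocabulary=[10, 20, 30], proverbs=[3, 2, 1])
```

Because both measures are put on a common scale before subtraction, the index is not simply a restatement of overall intelligence level, which is the reason given in the text for preferring it.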
McDonough (1960), acting on the 
assumption that process schizo- 
phrenia involves central nervous sys- 
tem pathology specifically cortical in 
nature, hypothesized that brain dam- 
aged patients and process schizo- 
phrenics would have significantly 
lower critical flicker frequency (CFF) 
thresholds and would be unable to 
perceive the spiral aftereffect signifi- 
cantly more often than reactive schiz- 
ophrenics and normals. Four groups 
of 20 subjects each were tested. The 
organic group consisted of individuals 
with known brain damage. One 
hundred and sixty-one schizophrenic 
case histories were examined, and 76 
were chosen from this group to be 
rated on the Elgin Prognostic Scale. 
The 20 patients receiving the lowest 
point totals were selected as being 
most reactive, while those with the 20 
highest scores were considered most 
process. 
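
McDonough's (1960) extreme-groups selection can be sketched as follows: rate the cases on the Elgin Prognostic Scale, then take the 20 lowest totals as the reactive group and the 20 highest as the process group. The function name, the patient identifiers, and the scores are hypothetical, and the handling of tied scores is an assumption, since the review does not say how ties were resolved.

```python
def select_extremes(elgin_scores, k=20):
    """Return (reactive_ids, process_ids) from a mapping of id to Elgin total.

    Lowest k totals are treated as most reactive, highest k as most process.
    Assumption: ties are broken by the stable sort order of the input.
    """
    ordered = sorted(elgin_scores.items(), key=lambda item: item[1])
    reactive = [patient for patient, _ in ordered[:k]]
    process = [patient for patient, _ in ordered[-k:]]
    return reactive, process


# Hypothetical Elgin totals for the 76 rated case histories.
ratings = {f"p{i}": i for i in range(76)}
reactive_ids, process_ids = select_extremes(ratings)
```

Selecting only the extremes leaves the 36 middle cases out of the comparison, which sharpens the contrast between groups but says nothing about patients near the center of the scale.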

Results of the experiment revealed 
that organic patients were signifi- 
cantly different from all other groups 
in CFF threshold and ability to per- 
ceive the spiral aftereffect. Process 
and reactive schizophrenics did not 
differ from each other on either task, 
but reactive schizophrenics had higher 
CFF thresholds than normals. These 
results do not indicate demonstrable 
cortical defect in either process or 
reactive schizophrenia. 


PROCESS-POOR PREMORBID HISTORY 
VERSUS REACTIVE-GOOD 
PREMORBID HISTORY 


Rodnick and Garmezy (1957), dis- 
cussing the problem of motivation in 
schizophrenia, reviewed a number of 
studies in which the Phillips prog- 
nostic scale was used to classify 
schizophrenic patients into two 
groups, good and poor. For example, 
Bleke (1955) hypothesized that pa- 
tients whose prepsychotic life adjust- 
ment was markedly inadequate would 
have greater interference and so show 
more reminiscence following censure 
than patients whose premorbid his- 
tories were more adequate. 

The subjects were presented with a 
list of 14 neutrally toned nouns pro- 
jected successively on a screen. Each 
subject was required to learn to these 
words a pattern of pull-push move- 
ments of a switch lever. For half the 
subjects in each group learning took 
place under a punishment condition, 
while the remaining subjects were 
tested under a reward condition. 
The subjects consisted of 40 normals, 
20 poor premorbid schizophrenics, 
and 20 good premorbid schizophren- 
ics. The results confirmed the hy- 
pothesis. 

A reanalysis of Dunn's (1954) data 
indicated that a poor premorbid 
group showed discrimination deficits 
when confronted with a scene depict- 
ing a mother and a young boy being 
scolded, but good premorbid and 
normal subjects did not show this 
deficit. 

Mallet (1956) found that poor pre- 
morbid subjects in a memory task for 
verbal materials showed significantly 
poorer retention of hostile and non- 
hostile thematic contents than did 
good premorbid and normal subjects. 
Harris (1955) has found that in con- 
trast to goods and normals poor 
premorbids have more highly deviant 
maternal attitudes. They attribute 
more rejective attitudes to their 
mothers, and are less able to critically 
evaluate their mothers. Harris (1957) 
also found differences among the 
groups in the size estimation of 
mother-child pictures. The poors 
significantly overestimated, while the 
goods underestimated, and the nor- 
mals made no size error. 

Rodnick and Garmezy (1957) re- 
ported a study using Osgood's (1952) 
semantic differential techniques in 
which six goods and six poors rated 20 
concepts on each of nine scales se- 
lected on the basis of high loadings on 
the evaluative, potency, and activity 
factors. Good and poor groups dif- 
fered primarily on potency and ac- 
tivity factors. The poors described 
words with negative value as more 
powerful and active. The goods could 
discriminate among concepts, but the 
poors tended to see most concepts as 
powerful and active. 

Rodnick and Garmezy (1957) also 
investigated differences in authority 
roles in the family during adolescence 
in good and poor premorbid patients. 
While results were tentative at that 
time, they suggested that the mothers 
of poor premorbid patients were per- 
ceived as having been more dominat- 
ing, restrictive, and powerful, while 
the fathers appeared ineffectual. The 
pattern was reversed in the good pre- 
morbid patients. 

Alvarez (1957) found significantly 
greater preference decrements to 
censured stimuli by poor premorbid 
patients. This result was consistent 
with the results of Bleke’s (1955) and 
Zahn's (1959) observations of re- 
versal patterns of movement of a 
switch lever following censure. These 
experiments suggested an increased 
sensitivity of the poor premorbid 
schizophrenic patient to a threaten- 
ing environment. 

These studies reported by Rodnick 
and Garmezy (1957) indicated that it 
was possible, using the Phillips scale, 
to effectively dichotomize schizo- 
phrenic patients. However, the 
Phillips scale had predictive validity 
only when applied to male patients. 
Within this frame of reference it was 
also possible to demonstrate differ- 
ences between goods and poors in 
response to censure, and in percep- 
tion of familial figures. Variability in 
the results of schizophrenic perform- 
ance was considerably reduced by 
dichotomizing the patients, but it was 
often impossible to detect significant 
differences between the performance 
of good premorbid schizophrenics 
and normals. Rodnick and Garmezy 
(1957) suggest that the results be 
considered as preliminary findings 
pending further corroboration, though 
providing support for the concept of 
premorbid groups of schizophrenics 
differing in certain psychological di- 
mensions. 


PROCESS-REACTIVE EMPIRICAL- 
THEORETICAL FORMULATIONS 


Fine and Zimet (1959; Zimet & 
Fine, 1959) used the same population 
employed by Kantor et al. (1953) 
and the same criteria for distinguish- 
ing the process and reactive patients. 
For this study only those cases were 
included where there was complete 
agreement among the judges as to the 
category of schizophrenia. They 
studied the level of perceptual or- 
ganization of the patients as shown 
on their Rorschach records. The 
process group was found to have 
significantly more immature, regres- 
sive perceptions, while the reactive 
group gave more mature and more 
highly organized responses. The 
findings indicated that archaic and 
impulse-ridden materials break 
through more freely in process schizo- 
phrenia, and that there is less ego 
control over the production of more 
regressive fantasies. Zimet and Fine 
(1959) speculated that process schizo- 
phrenia mirrors oral deprivation or 
early ego impoverishment, so that 
either regression or fixation to an 
earlier developmental stage is re- 
flected in his perceptual organization. 
In contrast, it is possible that the re- 
active schizophrenic's ego weakness 
occurs at a later stage in psychosexual 
development, and any one event may 
reactivate the early conflict. 

An amplification of the process- 
reactive formulation has been suggested 
by Kantor and Winder (1959). They 
hypothesized that schizophrenia can 
be understood as a series of responses 
reflecting the stage of development in 
the patient's life at which emotional 
support was severely deficient. Schiz- 
ophrenia can be quantitatively de- 
picted in terms of the level in life to 
which the schizophrenic has regressed, 
and beyond which development was 
severely distorted because of disturb- 
ing life circumstances. The earlier in 
developmental history that severe 
stress occurs, the more damaging the 
effect on subsequent interpersonal 
relationships. Sullivan (1947) sug- 
gested five stages in the development 
of social maturity: empathic, proto- 
taxic, parataxic, autistic, and syn- 
taxic. The most malignant schizo- 
phrenics are those who were severely 
traumatized in the empathic stage of 
development when all experience is 
unconnected, there is no symbolism, 
and functioning is at an elementary 
biological level. The schizophrenic 
personality originating at this stage 
may show many signs of organic 
dysfunction. Prognosis will be most 
unfavorable, and delusional forma- 
tion will tend to be profound. 

In view of the primitive symbolic 
conduct and the lack of a self-concept 
in the prototaxic stage, the schizo- 
phrenic personality referable to this 
stage will be characterized by magi- 
cal thinking and disturbed communi- 
cation. The delusion of adoption 
often occurs. However, these patients 
are more coherent than those of the 
previous level. 

The parataxic schizophrenic state 
involves the inability of the self- 
system to prevent dissociation. The 
autonomy of the dissociations results 
in the patient's fear of uncontrollable 
inward processes. Schizophrenic 
symptoms appear as regressive be- 
havior attempting to protect the self 
and regain security in a threatening 
world. Delusional content usually 
involves world disaster coupled with 
bowel changes. Nihilistic delusions 
are common. While there is evidence 
of a self-system in these patients, 
prognosis remains unfavorable. 

The patient who has regressed to 
the autistic stage, although more 
reality oriented than in the previous 
stages, is characterized by paranoid 
suspiciousness, hostility, and patho- 
logical defensiveness against inade- 
quacy feelings. A consistent system 
of delusions will be articulated and 
may bring the patient into conflict 
with society. However, prognosis is 
more favorable at this stage than 
previously. 

An individual at the syntaxic level 
has reached consensus with society, 
so that if schizophrenia occurs it will 
be a relatively circumscribed reac- 
tion. Onset will be sudden with 
plausible environmental stresses, and 
prognosis is relatively good. 

Becker (1959) also elaborated on 
the lack of a dichotomy in schizo- 
phrenia. Individual cases spread out 
in such a way that the process syn- 
drome moves into the reactive syn- 
drome, so that the syndromes prob- 
ably identify the end points of a 
dimension of severity. At the process 
end of the continuum the develop- 
ment of personality organization is 
very primitive, or involves severe 
regression. There is a narrowing of 
interests, rigidity of structure, an 
inability to establish normal hetero- 
sexual relationships and independ- 
ence. In contrast, the reactive end of 
the continuum represents a higher 
level of personality differentiation. 
The prepsychotic personality is more 
normal, heterosexual relations are 
better established, and there is greater 
tolerance of environmental stresses. 
The remains of a higher develop- 
mental level are present in regression 
and provide strength for recovery. 

Becker (1959) factor analyzed 
some of the data from his previous 
study (Becker, 1956). The factored 
matrix included a number of back- 
ground variables, the 20 Elgin Prog- 
nostic Scale subscores, and a Ror- 
schach genetic level score (GL) based 
on the first response to each card. 
Seven centroid factors were extracted 
from the correlation matrix. Factors 
4, 6, and 7 represented intelligence, 
cooperativeness, and marital status 
of parents, respectively. The highest 
loadings on Factor 5 were history of 
mental illness in the family, excellent 
health history, lack of precipitating 
factors, and clouded sensorium. The 
Rorschach GL score and the Elgin 
scales did not load significantly on 
Factors 4 through 7. 

The remaining three factors paral- 
lel the factors Lorr, Wittman, and 
Schanberger (1951) found with 17 of 
the 20 Elgin scales using an oblique 
solution instead of the orthogonal 
solution used in this study. Factor 1 
is called schizophrenic withdrawal, 
loading on defect of interest, insidious 
onset, shut-in personality, long dura- 
tion of psychosis, and lack of pre- 
cipitating factors. At one end this 
factor defines the typical process 
syndrome, while the other end de- 
scribes the typical reactive syndrome. 
The Rorschach GL score loaded -.46 
on Factor 1. 

Factor 2, reality distortion, loads 
on hebephrenic symptoms, bizarre 
delusions, and inadequate affect. 
Rorschach GL score loaded -.64 on 
this factor. Factor 3 loaded on in- 
difference and exclusiveness-stubborn- 
ness. The opposite pole of this factor 
involves insecurity, inferiority, self- 
consciousness, and anxiety. Ror- 
schach GL score loaded .25 on this 
factor. 

Further analysis indicated that 
when Factors 1 and 2 were plotted 
against each other an oblique rota- 
tion was required, introducing a cor- 
relation of from .60 to .70 between 
schizophrenic withdrawal and reality 
distortion factors. Similar oblique- 
ness was found between Factors 2 
and 3, suggesting the presence of a 
second-order factor. 

However, the sampling of behavior 
manifestations in the Elgin scale 
overweights the withdrawal factor, 
which gives Factor 1 undue weight 
and biases the direction of a second- 
order factor toward the withdrawal 
factor. Also, it is not possible to ac- 
curately locate second-order factors 
with only seven first-order factors as 
reference points. In addition, sample 
size limited inferences about a second- 
order factor. There is the suggestion, 
however, of the existence of a 
severity factor, loading primarily on 
schizophrenic withdrawal and reality 
distortion. 

The author suggests utilizing the 
evidence from this study to form an 
index of severity of psychosis which 
could be used to make diagnoses with 
prognostic significance. This diag- 
nostic procedure would include factor 
estimates of schizophrenic withdrawal 
and emotional rigidity, based on 
Elgin scale ratings, and reality distor- 
tion, based on the Rorschach GL 
score. 

Garmezy and Rodnick (1959) 
pointed out that despite failure to 
find support for a fundamental bio- 
logical deviation associated with 
schizophrenia (Kety, 1959), the view 
of schizophrenia as a dichotomous 
typology influenced either by somatic 
or psychic factors has continuously 
been advanced. They maintain that 
on the basis of empirical evidence 
there is little support for a process- 
organic versus reactive-psychogenic 
formulation of schizophrenic etiology. 

Reviewing a series of studies using 
the Phillips scale as a dichotomizing 
instrument (Alvarez, 1957; Bleke, 
1955; Dunham, 1959; Dunn, 1954; 
Englehart, 1959; Farina, 1960; 
Garmezy, Stockner, & Clarke, 1959; 
Harris, 1957; Kreinik, 1959; Rodnick 
& Garmezy, 1957; Zahn, 1959), 
Garmezy and Rodnick concluded 
that the results indicate two groups of 
schizophrenic patients differing both 
in prognostic potential and sensitivity 
to experimental cues. There is an 
interrelationship among the variables 
of premorbid adequacy, differential 
sensitivity to censure, prognosis, and 
types of familial organization. This 
suggests a relationship between vary- 
ing patterns of early experience and 
schizophrenia, though it does not 
embody the acceptance of a given 
position regarding psychological or 
biological antecedents in  schizo- 
phrenia. 

Reisman (1960), in an attempt to 
explain the heterogeneous results of 
psychomotor performance in schizo- 
phrenics, suggested that there were 
two groups of schizophrenics, process 
and reactive, differing in motivation. 
The process group was seen as more 
withdrawn and indifferent to their 
performance, and consequently re- 
flecting a psychomotor deficit not 
present in reactives. In order to test 
this hypothesis 36 reactives, 36 proc- 
ess patients, and 36 normals per- 
formed a card-sorting task. The 
groups were distinguished according 
to the criteria of Kantor, Wallner, 
and Winder (1953). On Trial 1 all 
subjects were requested to sort as 
rapidly as possible. Then the sub- 
jects were assigned to one of four 
experimental conditions, with an at- 
tempt made to equate across the 
experimental conditions for age, esti- 
mated IQ, length of hospitalization, 
and initial sorting time. Condition 1 
(FP) involved sorting the cards seven 
more times and if the sort was fast 
the subjects were shown stress- 
arousing photographs. If they sorted 
slowly no photographs were shown. 
Condition 2 (SP) was the reverse of 
this. Condition 3 (FL) and Condi- 
tion 4 (SL) were similar to the first 
two conditions except that a nonrein- 
forcing light was used instead of the 
pictures. After Trial 8 all subjects 
were informed that there would be no 
more pictures or light, but were asked 
to sort rapidly for three more trials. 
With four conditions on Trials 2 
through 8, 10 subjects from each of 
the three groups participated in each 
of the two picture conditions, while 
eight subjects from each group par- 
ticipated in each of the light condi- 
tions. 
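
The allocation just described can be checked with simple arithmetic. The sketch below only restates the numbers given in the text (four conditions, 10 subjects per group in each picture condition, 8 in each light condition, and the trial sequence); the phase labels and variable names are hypothetical conveniences, not Reisman's (1960) terms.

```python
GROUPS = ("process", "reactive", "normal")

# Subjects per group assigned to each condition on Trials 2 through 8.
CELL_SIZES = {"FP": 10, "SP": 10, "FL": 8, "SL": 8}

# Trial phases as described in the text (labels are illustrative only).
PHASES = {
    "baseline": range(1, 2),      # Trial 1: sort as rapidly as possible
    "conditions": range(2, 9),    # Trials 2 through 8: pictures or light
    "post": range(9, 12),         # three further trials, no pictures or light
}

per_group = sum(CELL_SIZES.values())
total_subjects = per_group * len(GROUPS)
total_trials = sum(len(trials) for trials in PHASES.values())
```

The cell sizes sum to the 36 subjects reported for each group, giving 108 subjects and 11 sorting trials in all.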

The results indicated that the nor- 
mals performed about the same under 
all conditions. The process group 
under FP sorted as fast as normals, 
but performed slowly under the other 
three conditions, while the reactives 
were slowest under FP but were as 
fast as normals under the other three 
conditions. Within all three groups 
performance under FL did not differ 
significantly from performance under 
SL. Under FL and SL, however, re- 
actives and normals sorted more 
rapidly than the process group. These 
results supported the hypothesis of a 
motivational deficit for process schiz- 
ophrenics. The results also indi- 
cated that the pictures were nega- 
tively reinforcing for the reactives, 
while the process patients were moti- 
vated to see them. This suggested a 
withdrawal differential. The with- 
drawal of the process patients is of 
such duration that supposedly threat- 
ening photographs cause little anxi- 
ety. In contrast, reactive withdrawal 
is motivated by an environment that 
recently became unbearable. Con- 
fronted with pictures representing 
this environment the reactive patient 
experiences anxiety and avoidance. 
However, the results of this experi- 
ment are in contrast to the findings of 
Rodnick and Garmezy (1957) that 
prolonged exposure to social censure 
will result in greater sensitivity to 
that stimulation. 


SUMMARY 


This review of all the research on 
the process-reactive classification of 
schizophrenia strongly indicates that 
it is possible to divide schizophrenic 
patients into two groups differing in 
prognostic and life-history variables. 
Using such a division it is also pos- 
sible to demonstrate differences be- 
tween the two groups in physiological 
measures and psychological dimen- 
sions. 

The result of such an approach has 
been to clarify many of the hetero- 
geneous reactions found in schizo- 
phrenia. It also appears that the 
dichotomy is somewhat artificial and 
really represents end points on a 
continuum of personality organiza- 
tion. The most process patient rep- 
resents the extreme form of personal- 
ity disintegration, while the most 
reactive patient represents the ex- 
treme form of schizophrenic integra- 
tion. The reactions of this type of 
patient are often difficult to distin- 
guish from behavior patterns of nor- 
mal subjects. There does not appear 
to be any significant evidence to sup- 
port the contention of a process- 
organic versus a reactive-psychogenic 
formulation of schizophrenic etiol- 
ogy. 

It is difficult to decide on the 
most appropriate criteria for selecting 
schizophrenic subjects so as to reduce 
their response variability. Prefer- 
ences are generally found for one of 
three sets of criteria: Kantor, Wallner, 
and Winder's (1953) items, the Elgin 
Prognostic Scale (1944), or the 
Phillips scale (1953). The criteria of 
Kantor et al. (1953) do not provide 
a quantitative ordering of the vari- 
ables, and are descriptively vague in 
several dimensions as well as depend- 
ing upon life history material which 
is not always available. While the 
Elgin scale does provide a quantita- 
tive approach, it also has the disad- 
vantages of descriptive vagueness 
and excessive dependence upon life 
history material. The Phillips scale 
eliminates some of these difficulties, 
but its validity is limited to the ade- 
quacy or inadequacy of social-sexual 
premorbid adjustment. The need for 
more feasible criteria may be met by 
the factor analysis of pertinent vari- 
ables to obtain a meaningful severity 
index (Becker, 1959), or by using 
rating scales in which the patient 
verbally supplies the necessary in- 
formation. An example of the latter 
is the Ego Strength scale (Barron, 
1953), recently utilized in distinguish- 
ing two polar constellations of schizo- 
phrenia: a process type with poor 
prognosis and grossly impaired ab- 
stract ability, and a reactive type 
characterized by good prognosis and 
slight abstractive impairment (Her- 
ron, in press). 

This need for more efficient differ- 
entiating criteria reduces some of 
the significance of present findings 
using the process-reactive dimension. 
Nonetheless, the process-reactive re- 
search up to this time has succeeded 
in explaining schizophrenic hetero- 
geneity in a more meaningful manner 
than previous interpretations adher- 
ing to various symptom pictures and 
diagnostic subtypes. Consequently, 
there appears to be definite value in 
utilizing the process-reactive classifi- 
cation of schizophrenia. 

Hypnotic research can be broadly 
characterized as having either an in- 
trinsic or instrumental orientation. 
Intrinsically oriented research is con- 
cerned with the phenomena and 
nature of hypnosis itself whereas in- 
strumentally oriented research util- 
izes hypnosis to produce some condi- 
tion which is the object of study, e.g., 
personality alteration, psychopathol- 
ogy. Although both research orienta- 
tions present difficult methodological 
problems, this communication will 
limit itself to problems associated 
with the instrumental use of hypno- 
sis. 

Despite its ability to command en- 
during interest, instrumental hypnot- 
ic research has remained relatively 
inconsequential and isolated. One of 
the principal reasons for this state of 
affairs is the lack of criteria for de- 
termining the relevance of hypnoti- 
cally induced behavior to clinical or 
natural behavior. In the absence of 
adequate criteria, the data of in- 
strumental hypnotic research tend to 
be either rejected or consigned to the 
limbo of ambiguity. Adams (1957), 
in his review of laboratory studies of 
behavior without awareness, ex- 
cluded studies involving posthypnot- 
ic suggestion, automatic writing, 
extrasensory perception, and proc- 
esses of which the subject is unaware. 
A paralyzing caution was displayed 
by Ainsworth (1954) in her review of 
Rorschach validation research: 

Hypnosis provides another method for 
artificially altering the state of the subject 
while undergoing the Rorschach examination, 
although hypnotic studies are open to the 
question of whether the hypnotically induced 
state is comparable enough to the "genuine" 
state to provide validation evidence (p. 480). 


Most reviewers, however, do not 
even bother to mention the exclusion 
of hypnotic research. At the risk of 
being charitable, it is likely that re- 
jecting attitudes toward  instru- 
mental hypnotic research arise more 
from the lack of criteria for determin- 
ing relevance than from prejudice. In 
lieu of such criteria, most investiga- 
tors have approached this issue by 
ignoring it or by assuming that the 
induced behavior is equivalent in all 
respects to its natural counterpart. 
Phenotypic identity, however, does 
not necessarily imply genotypic iden- 
tity; i.e., the fact that behavior 
similar to anxiety can be produced by 
hypnosis does not mean that the 
mechanisms of hypnosis are the same 
or similar to the processes underlying 
clinical anxiety. Weitzenhoffer (1953) 
has pointed out that hypnotically 
induced phenomena resembling 
psychodynamic manifestations are 
apt to lack affective tone. He also 
recognized the importance of induc- 
ing an appropriate genotype by his 
assertion that “affective tone is most
apt to be absent when the sugges- 
tions are aimed at directly bringing 
about the overt manifestations rather 
than creating the type of factors 
normally responsible for these” (p. 
217). 

The topic of hypnotically induced 
psychopathology will serve as the 
focus of the inquiry because it high-
lights both the methodological and
conceptual problems involved in the
laboratory investigation of hypnot-
ically induced conditions. Such a
focus achieves enhanced significance
because the genotypic-phenotypic
relationships that constitute psycho-
pathology represent one of the cen-
tral problems in most psychoanalyt-
ically oriented theories of personal-
ity.


A Paradigm for the Hypnotic Induc- 
tion of Psychopathology 


A paradigm for demonstrating 
valid psychopathology must include 
a procedure for separating the 
mechanisms of suggestion from the 
mechanisms of pathogenic psycho- 
dynamics. Although it is doubtful 
that the mechanisms of hypnotic 
suggestion are similar to the mecha- 
nisms of pathogenic psychodynamics, 
clinical experience with hypnosis 
(Eisenbud, 1937; Rosen, 1953) indi- 
cates that hypnotic suggestion can 
set in motion nonsuggested patho-
genic psychodynamics and observable
psychopathology. Thus, hypnotic
suggestion should be used only to
induce a process that, under certain
specifiable conditions, is theoretically
capable of producing pathogenic
psychodynamics and psychopathol-
ogy. The hypnotically induced
process defines the genotype, and the
behavioral outcome defines the
phenotype. The genotype is defined
operationally by the statements in
the hypnotic suggestions; the pheno-
type is defined operationally by a
description of the subject's overt be-
havior. The description of the
phenotype is considered to be opera-
tionally valid clinical psychopathol-
ogy only if it satisfies the defining
criteria of a given classification of
psychopathology. In this way, the
investigator can operationally tie
down the genotype, or psycho-
dynamics, that produces the observed
psychopathology instead of having to
rely upon the uncertainties of clinical
inference in regard to natural psycho-
pathology.

The production of operationally 
valid clinical psychopathology by a 
hypnotically induced process permits 
the inference that the genotype is 
adequate, which in turn is supporting 
evidence for the theory from which 
the genotype is derived. If the geno- 
type does not produce psychopathol- 
ogy, there are two interpretive 
alternatives available: the genotype 
is inadequate and the theory from 
which it is derived is not supported, 
or the conditions of the experiment 
were unfavorable for an adequate 
test of the theory. 

The foregoing considerations sug- 
gest four principles, or criteria, that 
should guide research in this area. 
First, the induced process must in no 
way include cues as to how the ex- 
perimenter expects the subject to 
respond in any other respect. Orne 
(1959) has demonstrated convinc- 
ingly the sensitivity of hypnotized 
subjects to the expectations of the 
experimenter and the “demand” 
characteristics of the experimental 
design. Second, the induced process 
must produce other processes and 
behavior; that is, it must be re- 
sponse-producing. Third, some of 
these responses must satisfy the de- 
fining criteria for inclusion in some 
classification of psychopathology. 
Finally, as Orne (1959) suggests, some 
of the subjects must be asked by a 
co-experimenter, unknown to the ex- 
perimenter, to fake hypnosis in order 
to determine the demand character- 
istics of the research. 

REVIEW OF RELEVANT RESEARCH

Research in the area of hypnoti- 
cally induced psychopathology falls 
into three categories. 


Direct Suggestion

In studies of this type, suggestion
is used to produce a given response
which is considered to be clinically
meaningful. By suggestion the ex- 
perimenter reproduces in the sub- 
ject a specific mood, attitude, affect, 
or symptom. Although most of this 
research has been reviewed elsewhere 
(Weitzenhoffer, 1953), a recent in- 
vestigation by Levitt, den Breeijen, 
and Persky (1960) will be presented 
and discussed in detail because it is a 
particularly good example of the in- 
herent defects in this popular ap- 
proach. The procedure is straight- 
forward: the subject is made to feel 
anxious by listening to a taped pres- 
entation of a series of somewhat 
repetitious phrases of increasing emo- 
tional intensity containing a variety 
of synonyms for the emotions of 
anxiety and fear. 

A deliberate effort was made to 
produce anxiety in "pure" form be- 
cause, under natural conditions, 
there is usually an admixture of 
anxiety, depression, hostility, etc. 
Their attempt to produce anxiety in 
pure form is, therefore, an interfer- 
ence with the idiosyncratic phenotype 
and entirely destroys its clinical sig- 
nificance. Due to its covert intra- 
psychic origins, anxiety is not experi- 
enced in the same way by everyone, 
nor is its presence always detected 
and identified. Moreover, their 
emphasis upon such words as "fear,"
"dread," "apprehension," and
"panic" may well be reproducing
responses to external threat rather 
than generating responses to an un- 
known internal threat, which is a 
distinction often used to differentiate 
anxiety from fear. The affect of anx- 
iety is even more complicated than 
they have observed because it can be 
managed in different ways. It may 
be managed defensively by hostile 
or depressive reactions, projected as 
in a phobia, or converted into somat- 
ic processes. A study which ignores 
the personal equation in the hypnotic 
induction of psychopathology
should be designated as an experi-
mental analogue of the clinical be-
havior in question. It is scientifically
legitimate to endeavor to abstract
and to purify emotions as they have
done, but these, by definition, are
not clinical phenomena; it is con-
trolled, laboratory research that
purposely creates conditions to elimi-
nate the clinical “taint” of the data.
In terms of the paradigm, the
major shortcoming of direct sugges-
tion is the identity between the geno-
type and phenotype. The subject
merely carries out the suggestions
that are given to him; the instructions
specify the behavior. This also means
that direct suggestion is not response-
producing in the sense that other
processes are set in motion which lead
to the behavioral outcome. In order
to satisfy the paradigm, a process
must be induced that has the capacity
to trigger off a chain of events that
eventuates in psychopathology. The
nature of the genotype and the condi-
tions under which it is induced will
reflect some theory about personality
and psychopathology. In this sense,
the paradigm is a procedure for test-
ing theories. Direct suggestion tests
nothing but itself.


The Induction of Artificial Conflicts


In studies of this type (Bobbitt,
1958; Counts & Mensh, 1950; Erick-
son, 1944; Huston, Shakow, & Erick-
son, 1934; Luria, 1932), the subject is
provided with a paramnesia regard-
ing a situation to which he has a dis-
tressing emotional reaction, such as
hostility or remorse. In one way or
another, the subject is usually told
that he will not remember anything
about the experience posthypnoti-
cally, but, nevertheless, it will be
disturbing to him. Although the in-
duced experiences are intended to be
perceived as real rather than con-
trived, the paradigm is not satisfied:
the subject is told that he will not re- 
call the paramnesia or that he will 
recall it to a certain degree; further- 
more, he is told that the paramnesia 
will be a source of posthypnotic dis- 
turbance. The design of these studies 
is also incomplete because of the lack 
of control subjects who are asked to 
fake hypnosis, and the significance of 
the results is vitiated by the relatively 
weak disturbances that were pro- 
duced. 

Wolberg (1947) reported a pro-
cedure which seems to approach
closely the paradigm. Instead of 
implanting a paramnesia, he sug- 
gested an impulse that would pro- 
duce conflict in the waking state. 
His instructions were as follows: 

When you awaken you will find next to you 
a bar of chocolate. You will have a desire to
eat the chocolate that will be so intense that it
will be impossible to resist the craving. At the
same time you will feel that the chocolate does
not belong to you and that to eat it would be 
very wrong and very bad. You will have no 
memory of these suggestions when you 
awaken, but you will, nevertheless, react to 
them (p. 337). 


The distinctive aspect of his in- 
structions is the posthypnotic sug- 
gestion of an overwhelming impulse 
which is rendered anxiety-producing 
by pitting it against conscience. 

Although his subjects were in-
structed to perceive the induced im-
pulse in terms of conscience, they
were not instructed to develop symp-
toms. Accordingly, it is of great in-
terest that the procedure spontane-
ously produced both somatic and
psychological reactions, which in-
cluded such marked symptoms as diz-
ziness, tachycardia, and a negative
hallucination. Since his procedure
approximates closely the paradigm,
the posthypnotic psychopathology
may very well be a valid clinical phe-
nomenon. If he had used the proper
control subjects, a more positive
statement could be made. In order
for his instructions to be perfect, the
subjects should not have been told
how to perceive the impulse nor
should he have suggested an amnesia.
The impulse should spontaneously
generate all subjects' reactions.

An investigation by Reyher (1961) 
also approaches the paradigm. Un- 
der deep hypnosis the subjects were 
given a hallucinatory experience that 
generated intense feelings of hostility
toward a given individual. The in- 
structions were as follows: 


Now listen carefully. After I awaken you, 


experience 

classes of words are mentioned] will stir 
overwhelming feelings of hate. If these feel-
ings break into you will
realize that it is the person who owns these
papers [which are within arm's reach] that you
hate, and you will have an overwhelming urge 
to tear them up. 


Posthypnotic conflict was created
by presenting tachistoscopically crit-
ical and neutral pairs of words until
one word of each pair was recognized.
Ideally, the instructions should not
have included such an ambiguous
word as "if," nor should an amnesia
have been produced, even though
the conflict-producing impulse was
to be experienced and acted upon
posthypnotically. Nevertheless, the
procedure produced much psycho-
pathology. The recognition of con-
flict words produced such reactions as
urticaria, tachycardia, gastric dis-
tress, headache, flushing, sweating,
tics, tremors, and such psychological
reactions as anxiety, apprehension,
dissociation, and derivatives. One of
the most important findings was a
correlation of .74 between the degree
of repression of the induced conflict and
the proportion of somatic complaints.

Other than the use of the ambigu-
ous word "if" in the hypnotic instruc-
tions and the suggested amnesia,
this study satisfies the paradigm. The
induced hostility contained no clues
in relation to the occurrences of
psychopathology, many symptoms
were produced, and proper control
subjects did not report symptoms.
Accordingly, it is reasonable to con- 
clude that the psychopathology was 
genuine. Spontaneous symptomatic 
reactions to hypnotically induced 
processes have been summarized by 
Weitzenhoffer (1953). Although 
these case reports of idiosyncratic 
reactions to hypnotic procedures are 
illuminating, they do not lend them- 
selves to laboratory investigation 
because of their unreliable and un- 
controlled nature. 


The Activation of Natural Conflicts 


Two studies fall into this category. 
Gordon (1959) instructed his sub- 
jects to bring to mind episodes in- 
volving conflict with parents. The 
subjects were given differing degrees 
of posthypnotic awareness of the 
episodes. No symptoms were re- 
ported. Although this approach 
gains clinical significance by permit- 
ting the subject to dwell upon his own 
emotionalized experiences, the in-
vestigation is difficult to interpret be-
cause of the posthypnotic suggestion
to achieve a given degree of aware-
ness. Utilizing the subject's own
conflicts is a promising method for 
producing psychopathology and 
eliminates almost entirely the criti- 
cism of artificiality, provided that the 
subject is not instructed how to re- 
act posthypnotically. Nevertheless, 
it must be shown that such reliving 
of past experiences is anxiety-pro- 
ducing. Since no symptoms were re- 
ported, the conflicts were probably 
not intense enough to generate symp- 
toms or other phenotypic mani- 
festations of psychopathology. 

An investigation by Reyher and
Shoemaker (1961) is also pertinent. 
TAT cards were utilized as stimuli 
for producing age regressions and the 
reliving of important emotionalized 
experiences. Ten TAT cards were 
selected randomly to be con- 
flictual or neutral for four subjects 
who were capable of deep hypnosis. 

In order to create a conflict to each 
of five cards, hypnotized subjects 
were told, as they looked at each 
card, that disturbing emotions would 
be aroused. The subjects were then 
regressed to a time when these emo- 
tions were difficult to manage. In 
order to create nonconflictual or 
neutral reactions to the other five 
cards, the instructions were the same 
as above except that the emotions 
were nondisturbing. A posthypnotic 
amnesia was suggested and, in addi- 
tion, the subject was told that the 
cards would stir up the same feelings 
as before, and that he would reveal 
them directly or indirectly in the 
stories that he would be asked to tell. 
In the waking state, the subject was 
given the same cards, by another 
experimenter, with standard in- 
structions. 

Although no symptoms were re-
ported by the subjects, marked differ-
ences were observed between the con-
tent of the hypnotic reactions and
the waking stories. The conflict-
cards were generally associated
with more alterations than were the
neutral-cards; however, on some occa-
sions the latter also were associated
with marked differences. These differ-
ences for both kinds of cards almost
always reflected unresolved conflicts
and helped guide psychotherapy.

Unfortunately the paradigm had 
not been formulated before this in- 
vestigation was carried out. The in- 
duced processes were not kept dis-
tinct from suggested phenotypic be-
havior, as the instructions do not
permit the induced process to gen- 
erate spontaneously all of the sub- 
ject's behavior: the subject is in- 
structed to tell a story related di- 
rectly or indirectly to the induced 
process. Since, in broad terms, the 
subject is told how to respond, it is 
impossible to determine what re- 
sponses were generated by the in- 
duced process and what responses 
were a direct reflection of the hypnot- 
ic suggestion. In order to satisfy 
the paradigm, the instructions should 
state that the posthypnotic adminis- 
tration of each TAT card will stir up 
the same thoughts and feelings as it 
did during the hypnosis and that 
these reactions will become over- 
whelmingly intense. 

Subsequent experience has shown 
that the indirect versus direct option 
can be dropped, because even neurot- 
ic subjects are not easily over- 
whelmed by hypnotically induced 
conflict. The subject's defensive
organization usually does a good job
in regulating hypnotically induced
stress; nevertheless, the experimenter
must be constantly alert for signs of
a serious breakdown in ego functions
when the subject is experiencing
distress. 

The absence of overt psychopathol-
ogy may be attributed to the fact
that the impulse was not suggested to
be overwhelming and that the sub-
ject was given an option of how to
react. Despite these inadequacies
in design for the production of psycho-
pathology, the hypnotic reactions
which produced the most alterations in
the waking stories were congruent
with areas of conflict described
in earlier psychodiagnostic impres-
sions but which had not yet come up
in psychotherapy. This observation
indicates that there were significant
psychodynamic reactions involved
and that this procedure might be very
productive of psychopathology, if 
utilized properly. 

There is reason to believe that the 
conflict-producing potential of the 
hypnotic reactions can be intensified. 
It was observed that in subsequent 
psychotherapeutic sessions with the 
subjects, "deeper" aspects of the hyp-
notic reactions which were markedly
changed in the waking state often 
could be uncovered by the induction 
of successive dreams about the mate- 
rial "behind" them. By telling the 
subject that he would have a dream 
about the emotions and thoughts 
behind his hypnotic experience, the 
material often became progressively 
more clearly represented until an 
abreaction of emotionally charged 
experiences took place. This ma- 
terial is valuable from the point of 
view of psychotherapy and, for re- 
search purposes, may be used to pro- 
duce anxiety and psychopathology in 
the posthypnotic state. 

These procedures would seem to 
have the greatest potential for creat- 
ing psychopathology, but they also 
have the disadvantage that they 
should be restricted to subjects who 
are in psychotherapy or those who are 
waiting to begin. The experimenter 
must be in a position to help the sub- 
ject work out adverse reactions if 
they should occur; otherwise, he 
places himself in an untenable ethi- 
cal and professional position. 


DISCUSSION 


All previous research in the hyp- 
notic induction of psychopathology in 
some way has interfered with the 
spontaneous reactions of the subjects 
by instructing them how to react to 
the induced processes; consequently, 
the interpretative significance of the 
subjects’ reactions is reduced in pro- 
portion to the extent of the interfer-
ence. If the induced processes have
no intrinsic capacity for spontane- 
ously producing alterations in be- 
havior—such as distortions, repres- 
sion, psychosomatic reactions, etc.— 
then the induced processes have no 
real clinical significance, and the 
imposed experimental reactions are 
merely hypnotic suggestions to be 
carried out. 
Two methods derived from psycho- 
analytic theory were presented in 
which emotions can be linked with 
anxiety and the production of psycho- 
pathology: one artificial and the 
other natural. First, an emotion,
such as hate, is brought to over- 
whelming intensity by a set of 
appropriate hallucinatory experi- 
ences (paramnesia). Since the intense 
hate would pose a vital threat to the 
subject's security under the cir- 
cumstances of the waking state, it is 
hypothesized that its activation by a 
posthypnotic signal creates the con- 
ditions for conflict, anxiety, and 
psychopathology. The posthypnotic 
intensification of hostility activates 
the subject's traditional defenses 
against hostility of such intensity; 
that is, there is a danger point in the 
intensity of hostility beyond which 
the subject would lose control and, 
thereby, subject himself to the re- 
taliation of the environment. The 
necessary controls and defenses are 
learned early in life and are triggered 
off in the posthypnotic state at the 
time the relevant posthypnotic signal 
is given. The second method is the 
same as the first except that the sub- 
ject's own idiosyncratic conflicts are 
activated by the posthypnotic signal. 
His defenses against anxiety-produc- 
ing processes are pressed beyond 
their usual limits, and anxiety and 
psychopathology are produced. 
The paradigm can be utilized to 
test theories regarding specific kinds 
of psychopathology and almost any 
alteration in personality. For ex-
ample, if a state of depression is de-
sired, there are at least two clinical 
models from which to choose: the 
subject is made to believe that the 
objects and symbols for the grati- 
fication of his important emotional 
needs are no longer available, and 
in the waking state, he is given a 
paramnesia consistent with these 
events; if clinical, reactive depres- 
sion is desired, hostility is induced 
toward loved ones in subjects who, 
on the basis of previous knowledge, 
turn this kind of hostility inwards. 
More directly, it may be possible to 
condition subjects who have a ten- 
dency in this direction to react to 
their own hostility by turning it in- 
wards, and then to produce a param- 
nesia which involves a situation 
that normally would lead to intense 
hostility. The subject is given a post- 
hypnotic signal for this hostility to 
become intense and conscious. Only 
those subjects are retained for study 
who do not achieve awareness of their 
hostility, despite the posthypnotic 
suggestion to do so. 

In regard to the induction of a 
paramnesia or the implantation of an 
impulse that ordinarily is foreign to 
the subjects, there is reason to believe 
that a suggested amnesia for the hyp- 
notic session may be necessary. If an 
amnesia is not suggested, it may be 
that enough fragments of the session 
will be recalled by the subject for 
him to realize that the experimenter 
had implanted something, and the 
growth of subsequent insight into the 
true nature of the experience would
render the conflict innocuous. A sug-
gested amnesia would prevent the
subject from acquiring insight and
preserve the conflict.

This also may be true for the
activation of the subject's own con-
flicts. The fact that the experimenter
succeeds in getting a hypnotized sub-
ject to become aware of conflictual
material indicates that repression 
already had started to break down 
and that the subject may find it 
relatively easy to become aware of 
the material in the posthypnotic 
state. The hypnotic uncovering of 
conflictual material in patients un- 
dergoing psychotherapy supports this 
observation. When potent repressed 
material becomes represented in hyp- 
nosis, this indicates that the forces 
maintaining repression have been 
growing progressively weaker. This 
is illustrated by the fact that only 
after many months of intensive 
psychotherapy, including hypno- 
analytic techniques, do the most sig- 
nificant repressions begin to lift. 
They begin to break down because 
the way has been prepared by the 
progressive development of a more 
secure relationship with the psycho- 
therapist and the prior achievement 
of insight into less intense facets of 
basic conflicts. Most psychothera- 
pists who are experienced with hyp- 
noanalytic techniques realize that 
while hypnosis is not an immediate 
and direct route to the uncovering of 
repressed material, it is certainly 
more rapid and more direct than most 
other methods. Once something has 
been uncovered in hypnosis, subse- 
quent insight in the waking state is 
usually attained readily. It may be 
that a posthypnotic amnesia rein- 
forces repressive forces and thereby 
preserves the capacity of the induced
process to produce psychopathol-
ogy.

There is some evidence that the
relationship between the experi-
menter and the subject is a significant
factor in successfully inducing con-
flict. In an unpublished study, the
author was able to replicate the re-
sults of an earlier study (Reyher,
1961) which produced somatic and
psychological reactions to the post-
hypnotic stimulation of hypnotically
induced conflict. However, an assist- 
ant using the same procedure could 
produce only symptoms of a rela- 
tively mild degree. Other than the 
different experimenters, there was one 
obvious difference in the preparation 
of the subjects that might account for 
this discrepancy. The subjects who 
were used by the assistant were hyp- 
notized and brought to a deep trance 
by someone else. The assistant merely
saw them for one brief session before 
the experimental session in order to 
establish the depth of the trance. In 
contrast, the senior experimenter had 
begun with naive subjects and 
brought them to a deep level of 
hypnosis himself in the course of 
three or four sessions. By the time 
the subjects were ready for the experi- 
mental sessions, they seemed to be 
quite at ease and trusting of the ex- 
perimenter. It is reasonable to 
hypothesize that an unfamiliar ex- 
perimenter would arouse some anx- 
iety and defensive behavior that 
would interfere with the effect of any 
suggestions of a personal nature. 

No matter what is to be induced 
hypnotically, it is wise to present 
the instructions in the passive voice. 
The use of the passive voice reduces 
the possibility that the subject may 
act out a role to please the experi- 
menter. More specifically, the subject 
should not be instructed to carry out 
suggestions but he should be informed 
that he will be acted upon by some- 
thing or that he is going to experience 
something. Not only does the active 
voice promote the expectation that 
the subject should do something, but 
it also enhances volitional, adaptive 
processes which render the hypnotic 
behavior similar to waking behavior. 
Thus, the instructions should mini- 
mize the role of volitional processes 
and maximize the role of nonvoli-
tional processes.

The present paper is limited to one
phase of a continuing controversy,

that of "psychic" phenomena. Within 
this general field, there is the rubric 
concerned with dice throwing and re-
lated tests designated as "psycho-
kinesis" (PK). In everyday language,
if ESP represents "mind to mind,"
PK refers to "mind to matter."
Thus Rhine and Pratt (1957, p. 7)
speak of two main subdivisions of
parapsychology: extrasensory percep-
tion and PK. For them, PK is “The
direct influence exerted on a physical
system by a subject without any
known intermediate physical energy
or instrumentation" (p. 209). In pre-
paring this review, more than 200
publications have been examined,
which deal with one phase or another
of PK or its hypothetical relationship
to ESP. It is not possible, because
of limitations in space, to review in
detail every report. Every attempt
has been made to insure complete
coverage of the PK data reports, the
first of which appeared in 1942. In an
area saturated with controversy, it
should be expected that a review con-
fined to PK will be considered by
some to be too restricted. For some
holding this view, the definitive proof
of ESP has been obtained, whereas


1 Grateful acknowledgement is made for
(Sta 175 "Welton Stanford Fellowship
phase of a larger study of the process of contro-
versy in scientific endeavors.


PK is still controversial. Some others 
would hold that definitive proof of a 
“qualitative” or "spontaneous" na- 
ture existed prior to the establish- 
ment of the Duke Parapsychology 
Laboratory. 

There are a number of reasons to 
justify a review devoted specifically 
to PK. After study of the published 
data and discussions with interested 
parties here and abroad, it seems 
clear that all of the issues which have 
been raised with respect to ESP also 
appear in connection with PK. The 
topic constitutes a unit which can be 
considered within the limits of a 
single publication. It is an area with 
which the academic psychologist is 
generally unfamiliar. Although criti-
cisms of some of the PK reports, espe-
cially the earlier dice tests, have ap-
peared, there has appeared no assess-
ment of psychokinesis as a whole.

PK constitutes a controversy 
within a controversy, with different 
positions taken concerning the reality 
of one or another aspect of psychic 
research. On the one hand, there is 
the assumption by Rhine as well as 
some other believers that PK and 
ESP are related. For Rhine (1944a), 
“The proof that the mind is extra- 
physical in nature does not rest, how- 
ever, on the ESP work alone; it has 
received powerful confirmation from 
many of the PK researches which 
have especially borne upon this issue"
(p. 250). Rhine (1947) states that:
“The most revealing fact about PK
is its close tie-up with ESP [p.
120]... PK implies ESP, and ESP
implies PK" (p. 129). Murphy
(1952a) noted that “The law of the 
decline curve [reduction in scoring 
rate for designated targets] which we 
quoted earlier in relation to ESP 
holds in PK, as Dr. Rhine had shown 
earlier, just as it does in ESP" (p. 57).
In a review of the field, no mention 
was made of PK (Murphy, 1958) but 
most recently it was concluded 
(Murphy & Dale, 1961) that “the 
thoughtful modern reader can no 
longer slam the door on psycho- 
kinesis" (p. 182). Thouless and 
Weisner (1946) extended the use of 
the term “psi” so as to include PK. 
For Heywood (1959) the convincing 
evidence comes from McConnell, 
Thouless, and Fisk (cf. Table 2). 
Eccles (1951), well known in neuro- 
physiology, has proposed a cortical 
theory incorporating ESP and PK 
into brain physiology. The PK stud- 
ies were interpreted as evidence for 
the reality of a "force" that was pre- 
sumed to have made it possible for a 
famous Italian medium in 1908 to 
cause a stool to rise (Ducasse, 1951). 
On the other hand, Flew (1953) 
"must confess to almost invincible 
incredulity" (p. 104). Eysenck 
(1958), who accepts ESP, stated that 
with respect to Rhine's PK data “we 
should be cautious in accepting these 
results until they have been dupli- 
cated successfully elsewhere" (p. 
140). In what has been called by 
McConnell (1954) the most impor- 
tant book on parapsychology since 
"ESP-60" (Rhine, Pratt, Smith, 
Stuart, & Greenwood, 1940), Soal 
speaks of PK as an “alleged” effect 
(Soal & Bateman, 1954, p. 360). 
Elsewhere, Soal (1948) noted that 
“Dr. Rhine's book [Reach of the Mind] 
certainly merits 'remarkable' in more 
senses than one" (p. 185). McConnell 
(1948) commented that Soal “ac- 
cepts telepathy but not PK. . I do 
not understand the type of
which is bold enough to defy e
science by accepting telepathy
yet is so timid as to deny psycho-
kinesis when the evidence for the
latter is rather better than for the
former" (pp. 242 f.). West (19
judged that “Without calling the
experimenters liars, the case
does not seem to be challeng
is probably even more clear-cut than
the case for ESP itself” (p.
Some years later, however, West
(1954b) judged that "Further re-
search is needed before it can be ac-
cepted as an established conci
par with ESP. ... There is nothing
definite to connect PK with the very
palpable forces associated with physi-
cal mediums” (p. 115). These differ-
ences among those committed to the
field as well as the agnosticism of
many academic psychologists indi-
cate a need for a thorough review of
the area.


PK HYPOTHESIS


There would appear to be no question that the hypothesis, however
vaguely formulated, behind the […] Parapsychological Laboratory
dice-throwing tests was psychological in nature. With respect to the Early
Dice tests (cf. Table 1), Rhine (19[…]) concluded that “Now in 1944, […]
years later, the position to which […] is driven by the results of these
dice-throwing studies is this: There […] direct psychical effect on the fall of
dice . . . and may be termed psychokinesis, or PK” (p. 190 f.). Elsewhere
he says “Psychokinesis is accord[…] produced by no mere blind and
purposeless force . . . PK reacts with […] physical object according to
intelligent design and direction” (Rhine, 1947, p. 117; also cf. Rhine, […]).

In its simplest terms, the experimental design of the PK tests was
based on wishing for specified […] in throwing a given number of dice.
The significance of the results was a 
function of the number of target 
"hits" beyond "chance." Speaking 
of the 18 studies upon which the case 
for PK was based, it is stated that 
(Rhine & Pratt, 1957) all "involved 
the same essential operation, the sub- 
ject's [S's] conscious effort to influ- 
ence the fall of dice so as to make a 
specified face or combination of faces 
turn up" (p. 60). This ostensibly 
implies that PK is indicated by ob- 
taining target hits. But a postmor- 
tem evaluation of these data sug- 
gested a second criterion: an “extra- 
chance" decline in the number of 
hits for specified targets (cf. Rhine, 
1957; Rhine & Humphrey, 1944a; 
Rhine & Pratt, 1957). For McConnell (McConnell, Snowdon, & Powell,
1955), “the evidence for psychokinesis
rests on two statistical effects. These 
are the total deviation from chance 
expectation of wished-for die faces, 
and the occurrence of extra-chance 
declines in scoring rates for these 
faces” (p. 269). Although strictly 
speaking hit-score and decline in 
scoring are indices, the PK hypo- 
thesis, for the present purpose, will 
be identified with these two effects. 

With respect to experimental design, occasionally, an independent
variable was employed. Thus if the variable is to be “wishing” for given
die faces to appear (i.e., “target hit”), an adequate experimental test
is most readily devised by comparing the scores obtained in wishing versus
nonwishing sequences. If, however, there is no independent variable, the
experimental design rests upon a significant statistical departure from
the Probability Model. There are at least two important considerations
here. First, there is the necessity of adequately controlling all known
factors. Secondly, there is the assumption that the unknown (but
knowable) factors are randomly distributed. Then, a significant
deviation from chance expectancy is taken
as evidence for extranormal phe- 
nomena. There are obvious experi- 
mental dangers in such dependency 
upon a Probability Model. The diffi- 
culty in applying an operational defi- 
nition based on it to paranormal is 
not to be minimized (cf. Boring, 
1955). This is undoubtedly another 
basis for the controversy with respect 
to ESP as well as PK. 

In addition to the dice throwing 
tests (Tables 1 and 2) two other cate- 
gories of data are reported in support 
of the PK hypothesis: e.g., throwing 
of objects other than dice (Table 3) 
and finally tests in which thrown (or 
released) objects are intended to rest 
in given target areas (Table 4). The 
present analysis is thus divided into 
four sections: (a) the early pioneer and (b) later phase in dice-throwing
tests; (c) wishing with objects other than dice, and (d) placement series in
which the released objects are required to land in a specified area of
the target table. Certain other reports have been omitted from
consideration here: e.g., wishing with plants (cf. Loehr, 1959; Vasse, 1950;
Vasse & Vasse, 1948) and paramecia (Richmond, 1952). Similarly, a
number of reports with animals (Osis, 1952) have been excluded. It would
be a fruitless exercise to determine 
whether such reports belong in PK or 
ESP. Since, from the published 
record, these reports are not pre- 
sented as crucial to the case for PK, 
detailed treatment of these data is 
not pertinent here. 

Some commonly employed terms 
should also be noted. A hit would be 
the appearance of a given die face 
when it had been designated as target. With more than one die per
throw (e.g., 2-96), the score would be the total number of dice showing the
wished-for target face. H(igh) targets are combined face readings of eight
or higher with two dice per throw or readings on a pair of successive
throws with a single die (Hilton, Baer, & Rhine, 1943). Conversely,
Lows would be scores of 2 to 6 inclusive. “7's” is a combined score with a
pair of dice totaling 7 in any combination, such as 1 and 6, 3 and 4. In
Reeves and Rhine (1945) however, score was for “doubles,” i.e., both
dice had to appear on six-face to be considered a hit (Table 2). Targets
given as 1-6 indicate that each face 
served as target, although not neces- 
sarily for an equal proportion of the 
total number of trials. A run consists 
of 24 die throws, in 24 single die 
throws or one throw of 24 dice. Thus 
a single casting of 96 dice would be 
tallied as 4 runs. Throughout, there 
will be reference to “declines” in
which significance is attached to a 
decline in the rate of scoring for desig- 
nated targets. This concept will be 
treated specifically in the section on 
Declines. 
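
The chance expectations implied by these definitions can be sketched as follows. This is a minimal illustration, not a reconstruction of any scoring procedure used in the original reports: it assumes fair dice, treats a throw of a pair of dice as one trial that can yield at most one hit (so that a 24-die run contains 12 pair-throws), and uses hypothetical function names throughout.

```python
from fractions import Fraction
from itertools import product

RUN_LENGTH = 24  # die throws per run, as defined in the text


def pair_probability(predicate):
    """Probability that the sum of two fair dice satisfies `predicate`."""
    outcomes = list(product(range(1, 7), repeat=2))
    favorable = sum(1 for a, b in outcomes if predicate(a + b))
    return Fraction(favorable, len(outcomes))


def expected_hits_per_run(p_hit, dice_per_trial):
    """Chance expectation of hits in one run of 24 die throws."""
    trials = Fraction(RUN_LENGTH, dice_per_trial)
    return trials * p_hit


# Single designated face: 24 trials, p = 1/6.
single_face = expected_hits_per_run(Fraction(1, 6), 1)

# Pair targets: 12 trials per run (assumption stated above).
highs = expected_hits_per_run(pair_probability(lambda s: s >= 8), 2)
lows = expected_hits_per_run(pair_probability(lambda s: s <= 6), 2)
sevens = expected_hits_per_run(pair_probability(lambda s: s == 7), 2)

# A single casting of 96 dice is tallied as this many runs.
runs_per_cast_of_96 = Fraction(96, RUN_LENGTH)

print(single_face, highs, lows, sevens, runs_per_cast_of_96)
```

Under these assumptions the expectation per run is 4 hits for a single face, 5 for Highs or Lows, and 2 for Sevens; the figure for Lows agrees with the 5.00 chance expectation quoted later for the low-dice runs.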


DICE TESTS
Early Dice Tests* 


The case made for PK is based on the first 21 reports in Table 1, data
which were collected during 1934-37 and published in the Journal of
Parapsychology 1943-46. Two reports by Nicol and Carington (1947) and
Nash (1944) belong in this period since they were carried out,
respectively, in 1934-36 and 1940, independently of Rhine. Excluding the
negative McDougall Series, the first 19 studies in Table 1 constitute a very
impressive score. There was a total of +6,515 target hits (i.e.,
appearance of wished-for designated target faces). The total number of runs
given in the published reports exceeds that given here, since some
unwitnessed runs and control series were excluded. But these constitute minor
discrepancies between the present tabulation and that given by Rhine
and Humphrey (1944a, p. 28) which also included some data from the
Later Dice tests. An overall summary of the test conditions and the
main findings are given in Table 1.

* It is to be noted that there is a gap of 6-9 years between the
collection of these data (1934-37) and their first publication in 1943.
One can appreciate the delay in publishing these reports “to avoid any
publicity about our PK work. At the time, 1934-37, the storm of
controversy was rising over the ESP research reported in 1934, and we thought
it best to withhold the PK work until that subsided” (Rhine, 1947, p. 104).
Concerning the controversy over the early card-guessing reports, see
“The ESP Symposium at the A.P.A.” (Anonymous, 1938), Kennedy’s
(1939) critique, “ESP-60” (Rhine et al., 1940), and the Ciba symposium
(Wolstenholme & Millar, 1956). It is a speculative fantasy as to what
would have been the fate of the early dice reports if they had been
submitted for publication to Rhine's own journal during 1939-41. During
this time, G. Murphy and B. F. Riess served as editors and were
supported by an advisory committee of eminent psychologists. The editors
had accepted Rhine's invitation provided that (Murphy & Riess, 1939)
“Emphasis is to be on consistent technical reporting with very detailed
accounts of experimental and statistical methods” (p. 1). The exchanges
during these 3 years between the Advisory Board, authors, and Rhine,
and the latter's conception of the responsibilities of an Advisory Board
to a scientific journal are illuminating.

In the simplest terms, the critical question is whether this outcome was
the result of wishing or any other psychological (and/or psychic)
variable. And was the subsequently reported decline in scoring also
psychological in origin? In terms of conventional scientific practice, an answer
to these questions is in part a function of the conditions under which
the results were obtained. These considerations will be evaluated as they
are relevant: some with respect to the present material, i.e., Early Dice
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TABLE 1 
EARLY DICE TESTS


Author | References | E | Ss | Throw | Target | Dice | Runs | Die throws | Deviation | CR

ine & Rhine (1943) 5 
Reeves & Rhine (1943) | Reeves So | On 562 300 | 7.40 
Gibson & Rhine (1943) | Gibson 1 (w) c 6 492 290 | 7.65 
Hilton, Baer & Rhine | (1943) | H&B 2) |hm | H EH ui v3 
Hilton & Rhine (1943) Hilton 3 c H br he 4.55 
Gibson, Gibson, & (1943) 3 eM | 6 110 & | xn 

Rhine (1943) Gibson 1(w) 
Rhine & Humphre i um Hr KOLI T 
Gibson, Gan (1943) McDougall | 9 h 1-6 269 — ug 

Rhine (1944) Gibson 
Rhine & Humphrey | (1944)° | Frick 2mm M 25 4 9| 4.13 
Price & Rhine 12140 Woodruff (109 M 6 2 "60 EIE 
Humphrey & Rhine" | (1945) Wooden i ©) M $ 2 i 126 T $5 
Reeves & Rhine (1945), | Reeves. [09 h 6s | 2 369 28 | 2°36 
Rhine 4 Humphrey opa ERE 7 hM | 7s 2 207 66 | 3.57 
Rhine & Humphrey* 9480 p gue ó 9 5.453 281 13. 
Averill e | (19499) 2 b 6 96 | "240 "A sa 
Rhine, H., & Averill* | (1945)P 4 b 6 96 680 216 | 47.59 
Rita?) ates 1 E b 6 96 | 480 =|. 
Nel & Rhine | (1942) 1(E) oM | 1-6 | 1 70 75 | 4.78 
Nicol & Carington | (1947), Nicol 3 c 16 | 1 139,396 | — i 

(1944) 113 0 16 | 1,6 8.16 — 
2 (Es) Control] 6 18,44 | — 


Note. Unless otherwise indicated, author is E; w under Ss is wife of E.
Dice throw is by b (box release), c (cup), h (hand), m (mechanical release),
and M (machine-tumbled). Each run totals 24 die throws, regardless of […].
Deviation is number of (target) hits beyond chance expectancy. Blank
indicates C(ritical) R(atio) is insignificant.
a Duke Report.
b Minor Report.
c E, as S, working alone.


tests; others in later sections of this
analysis.


Systematic Procedures


Ignoring for the moment, the basic consideration of experimental design,
it is clear that these tests were largely […] and off the cuff.
Variability of test conditions was a common occurrence. Whatever other
[…] result from such practices, replication of test conditions is
impossible. When Reeves was working solo at home, “For the most part,
[…] number of runs [were] made for high and low dice each day”
(Reeves & Rhine, 1943, p. 80). In Hilton, Baer, and Rhine (1943), Ss
were permitted to throw one or a pair of dice and, also in Hilton and Rhine
(1943), to choose a pair of dice from among three pairs of different sizes.
In the first of these two Hilton reports, it was also reported that there
was “no optional stopping” since the
tests were terminated by the close of
the semester. In the Gibson series, 
there were no established routines, 
nor was the order of target faces re- 
ported, although "at one time or an- 
other, they threw for each of the 6 
faces" (Gibson, Gibson, & Rhine, 
1943, p. 229). In this test, some runs 
were obtained with the experimenter 
(E) and some other individuals as Ss. 
In the later study (Gibson, Gibson, & 
Rhine, 1944) in addition to the two 
Gibsons who served as Ss, 429 miscel- 
laneous runs were obtained with 
other Ss. In the first Frick series, 
ostensibly all die faces were used as 
targets but the six-face was chosen 
for about 75% of the throws, with
complete "freedom allowed S" (Rhine
& Humphrey, 1944c, p. 142). In the
Smith solo, there were no established
routines, one-sixth of the runs were
obtained with another S and optional
stopping occurred. Faces 1, 2, and 4
served as targets for 97 runs (Rhine
1944b). In Humphrey and Rhine
(1945) all Ss preferred and used exclusively
the six-face as target. In
general, there was considerable varia- 
tion in target designation within the 
same test. Ss would be free to choose 
any desired face as target. When 
slumps would occur, E might suggest 
rate or manner of throwing be varied. 
For the most part, the six-face was 
the predominantly chosen target. In 
general, informality was the rule; a 
well designed, rigorously executed 
test was the exception. 


Recording Errors 


Little or no effort was made to insure accuracy in recording the obtained
scores. It might be unfair to
have required in these early tests 
completely objective records, e.g., 
photographic recording of all die 
faces on all throws. Certainly, it 
would not have been unreasonable to 
require two independent recorders 
for all die throws (cf. Dale, 1946; 
Dale & Woodruff, 1947). On the 
other hand, records made by a single 
E, aware of target designations, and 
especially those instances in which 
unwitnessed E was also S, give little 
confidence in the data. As noted in 
Table 1, four tests were entirely solo 
efforts. The hazard in the solo efforts 
of recording what is anticipated 
(wished-for target) is most critical. 
This difficulty is not minimized if 
only a small number of dice are cast; 
e.g., throwing one die or a pair of dice. 
In this circumstance it has been noted that “if there are no hits at a
given throw, the recorder may tend to make no entry.”

D. Parsons, August 1961.

Questions of accuracy of recording are equally serious in those cases in
which many dice were involved. As indicated in Table 1, in the first Frick
series there were 24 dice per throw; in three other Duke exploratory series
there were 96 dice to be counted on each box release. In the alcohol series
(Averill & Rhine, 1945) each author ingested a cocktail concocted of 100
cc of gin and an equal amount of a soft drink; one S “lost” the drink
during the experiment, and JBR was able to retain it only until
the tests had been completed. A
third person present functioned as E 
to record hits. Averill and Rhine 
served again as Ss 2 days later in the 
caffeine study (Rhine, Humphrey, & 
Averill, 1945), the results of which 
are considered such as “to render any future discussion of the chance
hypothesis unnecessary” (p. 87). It was noted Averill and Rhine “did not feel
as alert as usual” (p. 81) and again the third person served as E. A
combined or changing role in the experiment may interfere with reliable
recording. With Woodruff as S, Price tried to distract him “in friendly
rivalry” (Price & Rhine, 1944, p. 180) and also record the hits. In some
tests, two individuals served, respectively, in roles of recorder and
observer; yet frequently, these individuals also served as Ss (e.g., in Table 1,
the three Gibson series, and the First 
Frick report). 


Pooling of Data 


The practice of pooling of data obtained under different conditions,
statistically or experimentally, is not a defensible procedure. This was
manifestly apparent in Rhine and Rhine (1943). There is a set of data
in addition to that reported in Table 
1, for Mrs. Reeves working at home 
alone. With low dice as target, on 
435 runs the average score was 9. 
(5.00 chance expectations) or 
hits, and a reported, barely significant, CR of 2.58 (Reeves & Rhine,
1943, p. 81). The authors, combining both sets of data, report an excess of
H and L hits of 382 (CR = 7.34). In Hilton, Baer, and Rhine (1943), two
graduating seniors interchanged as S and E and were free to cast one or
two of three pairs of different sized dice. Of the total of 484 runs,
three-fourths were hand-thrown and the scores on the remainder
(mechanically released) were a little better. Although it was stated that
“optional stopping” was not involved, the tests
were concluded when the Ss were
graduated. The total deviation of 
+130 hits (484 runs) was a pooled 
score of trials with different sized 
dice, thrown in combinations of one 
and two dice, manually and mechani- 
cally. This total was reported signifi- 
cantly better than a nonwishing con- 
trol series of 128 runs (+4 hits) with 
medium sized dice alone. As Nicol 
has emphasized, in the Ciba sym- 
posium, the difference in score be- 
tween the control series and that in 
the main series obtained with the me- 
dium sized dice alone is statistically 
insignificant, “a modest 0.27" (Wol- 
stenholme & Millar, 1956, p. 34). 
More important, no distribution is 
given of the proportion of single die 
and two dice throws, mechanically 
released or manually thrown, sizes of 
dice, for each (or both) of the Ss. 
Distributions of scores—as High 
targets (8's or higher) or otherwise— 
are totally lacking. 

The data in Table 1 for Hilton and Rhine (1943) are pooled results for
Hilton, his sister, and his brother-in-law. It is admitted that “In terms of
controls, this research is less complete than is commonly the case” (p.
[…]). The data in Table 1 for the Gibson Large Cup Series were
contributed 14% by Gibson, 72% by his wife, and the remainder
by some 12 friends (Gibson et al., 1943). The social factor test
was a comparison of scores by Woodruff working alone and then throwing
on the following day “heckled” by 
Price with an additional 20 runs ob- 
tained some 2 weeks later (Price & 
Rhine, 1944). The Humphrey and 
Rhine (1945) data are pooled results 
of two sets of 63 runs each, each of 
which was obtained with a pair of 
different sized dice. 


Control Tests 


The purpose of these tests was to 
establish the role of wishing for speci- 
fied die faces (targets) to appear 
when the dice are cast. In Hilton, 
Baer, and Rhine (1943), for the con- 
trol test with medium sized dice (not 
included in Table 1), Ss were in- 
formed that this was to test the laws 
of chance; this was done to prevent 
their attempting to influence the 
dice" (p. 175). In this series of 128 
runs, the average fell to 5.03, giving 
an overall chance deviation of +4 
hits. However, if one compares this 
control series with those runs of the 
main series made with the medium 
sized dice, the difference is insignifi- 
cant, as Nicol has noted in the Ciba 
symposium (p. 34). 

The study of the "social factor" 
compared results obtained by Wood- 
ruff working (and recording) alone, 
with the results obtained when he was 
distracted by Price (who acted as 
recorder) and those occurring in the 
normal (unheckled) situation with 
neutral observers present (Price & 
Rhine, 1944) obtained the preceding 
week. The last data, of course, fail to constitute an acceptable
postexperimental control test. The “Minor Studies” involving pharmacological
variables violate the simplest requirements of control studies, lacking pre-
and postcontrol tests and "placebo" tests (Averill & Rhine, 1945; Rhine
et al., 1945). Equally inadequate are the data obtained with hypnosis; e.g.,
design modifications were introduced during the course of this study, since
the "experiments did not turn out as 
had been expected" (Rhine, 1946a, 
p. 130). 

The one test constituting a control 
on wishing is the Frick solo with 60 
dice/cup-throw in 1937. Working 
alone, he completed two series the 
first of which is reported in Table 1. 
As indicated there were +582 hits in
the 2,172 runs for the six-face target. 
A second series was now done, in 
exactly the same way, again recording 
6's (to make for consistent recording), 
but wishing part of the time for 1's 
and the remainder for 6's not to ap- 
pear. Under these conditions, for an 
equal number of (2,172) runs, there 
were +576 hits for the six-face, al- 
though S was wishing for 1’s. The 
CR for the excess 6’s was of the same 
order of significance as when first 
wishing for 6's in the first part 
(CR=6.77). These negative results 
evoked the interpretation (Rhine & 
Humphrey, 1945a) that there was 
"no place in Frick's personal philos- 
ophy to accommodate the PK hy- 
pothesis" (p. 215); "it appears that 
Frick must have, as it were, pretty 
completely deceived himself in the 
conduct of Series B . . . he was not
well unified in his motivational ele- 
ments" (p. 218). Therefore, if there 
were as many hits in the (control) 
Series B as in the (experimental) 
Series A, thus proving the dice were 
biased, this "would solve any prob- 
lem raised by the first series and bring 
the investigator the maximum peace 
of mind" (p. 215). The positive re- 
sults of Frick's first report (Rhine & 
Humphrey, 1944c), were not ques- 
tioned on motivational bases. Such 
post hoc rationalization is frequently 
to be noted, even in the Later Dice 
tests. Thus, the negative results obtained in Schwartz's (Table 2)
unwitnessed cup thrown experiment, planned in correspondence with the
Duke Parapsychology Laboratory, were suitably rationalized by Rhine
(1946c, 1952) and other illustrations are to be found in the […]
declines and in the long, la[…] series reported by Forwald (Table 4).
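
The comparison between Frick's two series can be made concrete with a short calculation. The sketch below assumes that the critical ratio is the conventional binomial one (deviation divided by the square root of npq), with 24 die throws per run and a chance probability of 1/6 for the six-face; the function name is hypothetical, and the comparison of the two series by a difference CR is offered only as an illustration, not as the analysis reported by the original authors.

```python
import math


def binomial_sd(die_throws, p_hit):
    """Standard deviation of the hit count under the chance hypothesis."""
    return math.sqrt(die_throws * p_hit * (1 - p_hit))


RUNS = 2172                 # runs in each Frick series, from the text
DIE_THROWS = RUNS * 24      # 24 die throws per run
P_SIX = 1 / 6               # chance probability of a six-face

sd = binomial_sd(DIE_THROWS, P_SIX)

# Series A: wishing for sixes (+582 hits beyond chance).
cr_series_a = 582 / sd

# Series B: recording sixes while wishing for other outcomes (+576).
cr_series_b = 576 / sd

# CR of the difference between two independent series of equal size.
cr_difference = (582 - 576) / (math.sqrt(2) * sd)

print(round(cr_series_a, 2), round(cr_series_b, 2), round(cr_difference, 2))
```

On these assumptions the +576 deviation reproduces the quoted CR of 6.77, the +582 deviation gives a CR of about 6.84, and the difference between the wishing and control series is a small fraction of one standard deviation.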


Probability Model 


Since, with the exception of the Frick-Solo Series, adequate control
tests are lacking in this series of reports, the case for PK rests upon
obtained scores (target hits) which deviate markedly from the theoretical
expectations predicted by the Theoretical Model, in this case, the
normal probability curve. As noted earlier, the reported data of the
relevant reports show an excess of +6,515 hits. What does this mean?
Although the published description of the test conditions leaves much to
be desired, a tabulation discloses the following distribution of die-face
targets:


Target     Number of studies     Total runs

[…]                4                 2,362
[…]                1                   207
[…]               11                12,732
[…]                3                 9,816

Total             19                25,117


Generally, scoring was recorded in terms of hits: how many dice turned
up with the wished-for face. The common practice was to score only
the hits, that is only those faces which were specified as targets. For this
reason, a further analysis is not possible in those studies in which the
targets were for Highs, Lows, or Sevens. However, some pertinent
information is available in those tests in which the target was varied
among Faces 1 to 6.

For these three studies, the following was reported. In Gibson, Gibson,
& Rhine (1943), Faces 2, 3, and 4 were chosen as targets in
7-11% of the throws, Face 1 was
chosen in 17% of the throws, and
Faces 5 and 6 served most frequently
as targets, respectively, 27% and
29%. In the 1944 report by the same 
authors (Gibson Machine), each die 
face was chosen as a target with equal 
frequency both by Gibson and his 
wife. A miscellaneous series of 429 
runs is omitted from the present con- 
sideration since nothing is reported 
on target faces for this sample. In 
the First Frick Report, Faces 1-2-3 
were chosen 4.7% and Faces 4-5-6 
95.3% of the runs, with the 6 face 
alone being selected on 75% of all the 
trials (Rhine & Humphrey, 1944c). 

In terms of all 19 studies, Faces 1-3 
served as target for about 13% of all 
runs. In all, six-face and sixes were 
specified as target for about 65% of 
all runs. Of a number of interesting 
considerations, it is self-evident that 
the most elementary requirement 
necessitated the equal representation 
of all six die faces as targets in some 
randomized order and the tabulation 
of all die faces on all trials. There is 
no need to make use of higher mathe- 
matics to conclude that biased dice 
could account for the obtained results. 
It is also important to note in this 
connection that Weldon’s report of a 
(nonwishing) long series of dice throw- 
ing was available years before the 
Duke tests were started. Several 
references are to be found for this 
test. The most common is given by 
Fisher (1938, p. 67) who reported 
that Weldon made a total of 26,306 
throws of 12 dice, scoring the number 
of times 5’s and 6’s appeared. There 
was an excess of (+) 1,378 hits 
(106,602 with chance expectancy 
predicting 105,224 5’s and 6’s). This 
analysis is to be found in all editions 
of Fisher's now classic book which 
was first published in 1925 and was 
acknowledged in the very first report 
by Rhine and Rhine (1943). 
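
The size of the excess in Weldon's nonwishing series can be checked directly from the figures quoted above. The following sketch assumes the usual binomial model with a chance probability of 1/3 for a 5 or 6 on each die; the variable names are illustrative only.

```python
import math
from fractions import Fraction

THROWS = 26306          # throws of 12 dice, as quoted from Fisher
DICE_PER_THROW = 12
OBSERVED = 106602       # 5's and 6's actually scored

die_throws = THROWS * DICE_PER_THROW
p_hit = Fraction(1, 3)  # chance of a 5 or 6 on a fair die

expected = die_throws * p_hit          # chance expectation
excess = OBSERVED - expected           # hits beyond chance

# Binomial standard deviation and the resulting critical ratio.
sd = math.sqrt(float(die_throws * p_hit * (1 - p_hit)))
critical_ratio = float(excess) / sd

print(expected, excess, round(critical_ratio, 2))
```

The expectation of 105,224 and the excess of 1,378 agree with the figures in the text. On the binomial assumption the excess is roughly five standard deviations, a departure from chance obtained with no wishing at all.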
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Some years later, Pratt (1947a) 
reported a nonwishing control series 
in which all die faces were recorded 
by an observer (cf. Table 2). Faces 
1-2 scored negative deviations from 
chance (combined = -348 hits),
Faces 5 and 6 gave positive scores 
(combined = +239 hits), and the scor- 
ing pattern was 6, 5, 4, 3, 2, 1. Pratt 
(1947a) acknowledged that dice with 
excavated spots (used by Gibson, for 
example) favor the higher faces and 
admits that "many PK reports have 
presented results which fit this pat- 
tern" (p. 55) emphasizing, however, 
that “the best answer to the biased- 
dice hypothesis came out of the posi- 
tion effect analysis” (p. 56). And at 
this time Rhine (1947) acknowledged 
that “The problem of faulty dice 
remained .... We recognized that 
we would have to conduct tests in 
such a way that bias in the dice 
would be equalized or controlled in 
some reliable manner. We decided to 
seek perfect experiments rather than 
perfect dice” (p. 99). By 1943 (Rhine 
1947) "We had, for instance, con- 
ducted tests in which an equal num- 
ber of throws were made for each 
face of the die” (p. 105). “But it was 
the decline evidence that settled 
every wavering doubt in our minds 
about PK" (Rhine, 1943, p. 105). 
From the published evidence, as well 
as personal conversations with a 
number of interested parties, it is 
clear that the question of dice-bias 
was of considerable concern and only 
the post hoc analysis of decline in 
scoring proved self-convincing. The 
serious defects in the experimental 
design and execution of these studies 
cannot be ignored and one reviewer 
(Soal, 1948) concluded that these 
“Duke experiments seem to have 
fallen into pitfalls that an intelligent 
school boy would have avoided” (p. 
185). None of these crucial weak- 
nesses in experimental design is recti- 
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fied by the postmortem report of a 
significant decline in hits in these 
already completed studies. By the 
most elementary standards, this con- 
stituted a new hypothesis subject to 
subsequent test. 


Negative Data 


The group of reports just con- 
sidered is characterized by the ab- 
sence of negative results. Frick's Solo 
test is the important exception and 
Rhine's post hoc rationalization has 
been noted (Rhine & Humphrey, 
1945a). No statement has been pub- 
lished concerning the total number of 
tests carried out or the proportion of 
results which were negative. The first 
independent negative findings are 
contained in the final two reports 
listed in Table 1. 

In what was one of the earliest at- 
tempts of wishing with dice, Nicol 
carried out a series of tests from 
1934-37 (Nicol & Carington, 1947). 
This study was the first in which all 
throws were recorded and all faces 
were used as targets in a systematic 
fashion. Results on each of four 
series of tests were insignificant. Of 
the total throws given in our tabula- 
tion (5,640 die throws unwitnessed), 
the most useful are the witnessed 
throws of Group 4. Here, each of 
eight Ss made 2,400 throws succes- 
sively for each of the die faces as 
target, in order from 1 to 6 (total
= 14,400 die throws/S). On these
115,200 throws, the total deviation of 
+91 was of no significance. The 
analysis of the data by Carington 
included a detailed examination for 
declines but "all results have been 
null" (p. 174). Of the three analyses 
made, significance is attached to one. 
However, Carington noted, “but
if this work stood alone [without
Rhine's reports], it would not be
sufficient to warrant the acceptance”
of the PK hypothesis (p. 175).
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The final report in Table 1 is the 
Minor Study by Nash (1944). It was 
carried out in 1940 with 113 Univer- 
sity of Arizona students, in ignorance 
of the work that was being prepared 
for publication at Duke University. 
All six die faces were used as targets 
with equal opportunity for all faces to 
serve as target. For all Ss combined, 
there was a grand total of 8,136 
throws with an insignificant devia- 
tion of +39 hits. Nash and his wife 
then carried out a control series with 
6 dice/throw for a total of 18,144 die 
throws. The scores, here, for each of 
Faces 1-3 were negative (combined = -92), whereas the deviations
for each of the target Faces 4-6 were 
all positive (combined = +92 hits) 
giving a remarkable overall deviation 
of zero for the entire control series. 
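
The way in which dice bias interacts with the choice of targets can be illustrated with a small calculation. The face probabilities below are purely hypothetical (a die slightly favoring the higher faces, as excavated spots are said to do), as are the target allocations and the function name; nothing here is taken from the published data.

```python
FAIR = 1 / 6


def expected_deviation_per_throw(face_probs, target_shares):
    """Expected hits beyond chance per die throw.

    face_probs:    face -> probability that the face turns up
    target_shares: face -> fraction of throws with that face as target
    """
    return sum(
        share * (face_probs[face] - FAIR)
        for face, share in target_shares.items()
    )


# Hypothetical biased die; the probabilities sum to 1.
hypothetical_bias = {1: 0.160, 2: 0.162, 3: 0.165, 4: 0.168, 5: 0.171, 6: 0.174}

# Every face serves as target equally often.
balanced = {face: 1 / 6 for face in range(1, 7)}

# Six-face chosen for 65% of throws, the remainder spread evenly.
six_heavy = {1: 0.07, 2: 0.07, 3: 0.07, 4: 0.07, 5: 0.07, 6: 0.65}

balanced_deviation = expected_deviation_per_throw(hypothetical_bias, balanced)
six_heavy_deviation = expected_deviation_per_throw(hypothetical_bias, six_heavy)

print(balanced_deviation, six_heavy_deviation)
```

With equal representation of all six faces the expected deviation vanishes whatever the bias, since the face probabilities must sum to one; this is the same arithmetic that makes the -92 and +92 of the Nash control series cancel. With targets concentrated on the six-face, the same hypothetical die yields a steady excess of "hits" in the absence of any wishing effect.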

Some overall generalizations are at 
once apparent with respect to these 
basic data. If the test of precognition 
(Woodruff & Rhine, 1942) and 
McDougall's negative data (Rhine & 
Humphrey, 1943) are ignored, the 
positive data in Table 1 offer very 
impressive scores. But many of these 
high scores were obtained under pro- 
cedures and conditions which were, at 
best, informal. Only nine reports were carried out on the premises of
the Duke Parapsychology Laboratory. Of these, five are entitled
Minor articles, the rationale for which classification is not given. The
remaining 11 reports (including McDougall's One Die) were carried
out in dormitories and homes by students, professional and business
people, and other interested amateurs. Four of these latter reports
were based on data obtained by individuals who worked alone, unsupervised,
acting simultaneously as E, S, and recorder. Thus more than 50%
of these data were obtained by ama- 
teurs lacking direct professional su- 
pervision. 
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TABLE 2 
Later Dice Tests 


Author References E Ss Throw 
Herter & Rhine“ | (1945) | Herter 
Gatling & Rhine* | (19 16) Gatling i S 
Rhine (194 % 5] Schwartze 1(E) | c 
Pratt & Woodruff^ | (1946)> 2 (Es) | m 
H: (1945) 1 (E) |c 
Parsons (1945) 27 m,M 
Pratt® (1947a) 10 |e’ 
Dale (1946) 54 m 
Dale & Woodruff | (1947) 24 m 
oue 
Gibson (1947)? m 
Gibson 494005 25 m 
Nash À (1946) | student 9 b 
Nash & Richards | (1947) Richards 48 b 
Nash (1956) Bray 20 m 
McConnell (1955) 9 M 
5 et al. (1955) 393 c, M 
(1950 a 
En & Rose 619510 20 a 
ie (1952) 25 c 
(1955) 9 c 
Humphrey* (194 7a) 3 c 
5 5 (1947 b) — 1(E)|c 
houless (1951) — 1 (E) c, m 
Mitchell & Fisk | (1953) 10 c 
Fisk & West (1958 1 c 
Var ke Va 6020 2 : 
5 isse 1951) 
MEME I a |: 
Van de Castle* (1958) White al 8 EA 
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Dice 
throw CR 
6 4.65 
: s 
$ — 9.97 
2 — 
3 — 
6,2 — 
$ = 

171 a 
4 n 2.60 
6 "E 
m 9 

79 B 
it 2 2.55 

312 . 
ae 2 6.23 
3 (a) E? 
2 18,092 = 
2 169,776 — 
6 = 
12 = 
12 rs 
12 E 
6 3.06 
12 2.46 
4 2.99 
3 3.66 
3,2 3.76 
12 
2 3.61 
1 3.29 
30 5.24 


Note. (a) Used six dice, wishing only for three dice; (b) wishing for specified colored sides of cubes; (c) wishing for H(ighs)
and L(ows) on […] different dice on each throw.
a Duke Report.
b Minor Report.
c E, as S, working alone.


Later Dice Tests 


The tests in this section were carried out after the development of the
decline (in scoring) hypothesis. There was more concern given to
experimental design, to avoid recording errors and to insure tests with all faces
(or Highs and Lows) as targets in the same study. In some tests, two
observers made independent records of […] and all faces thrown were
recorded. Unwitnessed solo efforts were few. The reports by Schwartz,
[…], Humphrey, and Thouless complete this category. In a few
[…] unsupervised students served […]. Also included in this group is a
series of tests of one individual, to be discussed under Sensitives.
A tabular summary of the designs, as well as the exact references,
is contained in Table 2.
Of the 30 studies listed in Table 2, 9 originated in the Duke Parapsychology Laboratory. The first 2 of these Duke reports were carried out by unsupervised students in dormitories (Gatling & Rhine, 1946; Herter & Rhine, 1945). Two others were listed as Minor reports; the first part of Mangan's (1954) report was a solo effort, and Pratt's (1947a) test was a nonwishing control series. The remainder were the carefully executed studies, negative in character, by Pratt and Woodruff (1946) and Van de Castle's (1958) combined ESP-PK test, Humphrey's interesting but inadequate Help-Hinder test (Humphrey, 1947a), and her High-Low solo effort (Humphrey, 1947b).

The period is ushered in primarily 
by a number of independent attempts 
to confirm the early Duke results. 
The efforts of Nash (1944) and Nicol 
and Carrington (1947) have already 
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been considered with Table 1. In line 
with these findings are the negative 
results later obtained by Hyde (1945) 
and Parsons (1945). In addition to 
an insignificant deviation from 
chance, Hyde (1945) found no evi- 
dence for declines and concluded that 
“Positive PK results are not guar- 
anteed by repetition of the published 
American [Duke] technique" (p. 296). 
In a larger series, with more Ss and 
more than one technique, Parsons' 
(1945) data were also negative in 
terms of deviation and hit distribu-
tion (i.e., absence of declines) with
the record sheet patterned on the 
Duke record form. Dice bias to 
higher faces was detected. Also to be 
noted from the London SPR group 
(not included in Table 2) was Scott's 
(1947) report of negative results with 
the Cambridge group and West's 
(1954a) report of negative results in 
other SPR unpublished data. 
In addition there was the series of 
studies carried out by Rose and his 
wife. In the first study (Rose, 1950), 
E and his wife served as main Ss with 
supporting data obtained from 21 
friends. The dice were released down 
an inclined plane when observer with- 
drew the ruler upon which the dice 
had rested. All trials were witnessed 
but Ss “were allowed for the greater 
part of the series to select their target 
faces and consequently there were not 
equal numbers of runs for each face 
of the die" (p. 116). With arbitrary stopping in addition, "to change to more rigid and well-balanced procedures" (p. 117), the "loose design of the experiment" (p. 125) requires no further comment.

Although the pooled deviation was insignificant (p = .16), the wife's result was considered significant, with an overall score of +83 hits on a total of 3314 runs. Her score of +85 hits
in one section of 125 runs when the 
six-face was the target should be noted. The lack of the several face scores prevents pinpointing the source of E's +41 hits on his 393½ runs (he preferred the five-face as target), whereas the other two main Ss
(total 2561 runs) and the remaining 
miscellaneous friends (total 375 runs) 
all scored overall minus deviations. 
A control series of 400 runs by a 
friend was reported in which there 
was a minus deviation on each of 
Faces 1-4 with +25 hits on Face 5 
and +28 hits on Face 6. Dice bias 
was clearly confounding the experi- 
ment, a possibility which should 
have been obvious to the authors 
from the already published studies. 

In a combined ESP-PK study 
(Rose & Rose, 1951), colored cubes 
(without indentations) "accurately 
shaped by an engineer" (p. 129) were 
used in the latter phase with 20 
Australian aborigines. Although there was optional stopping, each S had a minimum of 24 runs, with an equal number of throws for the several faces. Rose supervised and called the results, which were checked by his wife, who recorded only hits. With marginal success for one S (+108 hits on 600 runs), scores for given face colors were not reported. The CR of 1.61 for the pooled data was insignificant.

A further study (Rose, 1952) was now carried out with Duke dice, using 25 Ss and all faces as targets, again in a combined ESP-PK test. Three sets of data were obtained with the dice in different locales, the results of which were all insignificant: +128 hits on 1896 runs, -27 hits on 168 runs, and +7 hits on 1128 runs. The pooled data, presumably for all 25 Ss combined, were equally insignificant.

In a follow-up (Rose, 1955), again as part of a combined ESP-PK study, aboriginal Ss threw from a hand shaker; competition was encouraged, and only hits were recorded. Deviations for all Ss, including the best previous S (Rose, 1952), were insignificant, the pooled data (612 runs) totaling +6 hits (CR = .12).
A series of night tests was reported 
by McConnell (1955), in which the 
targets were wished for by Ss before 
going to sleep. Scores for the ma- 
chine-rotated pair of dice were photo- 
graphically recorded. Several target 
arrangements were used, e.g., doubles, 
as well as a sequence involving all 
faces. McConnell's score was reported as significant (+66.5 hits), but the overall tabulation of +73.5 hits for 18,092 die throws of all Ss combined was not significant (p = .14). The most satisfying experimental component consisted of tests in which each S selected a different given target face, 1 to 6, for each succeeding night. The score for the pooled data for this organized section was -26.3 hits for the 8,534 die throws (CR = .76).
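
The critical ratios quoted for the night tests can be checked with a short calculation. The sketch below assumes the usual binomial model for single-die targets (chance of a hit 1/6 per die throw) and a normal approximation for the two-tailed probability; the function names are illustrative and are not taken from the reports, and the second throw count is read as 8,534.

```python
import math

def critical_ratio(deviation, trials, p_hit):
    """Deviation from chance expectation divided by the binomial SD."""
    sd = math.sqrt(trials * p_hit * (1.0 - p_hit))
    return deviation / sd

def two_tailed_p(cr):
    """Two-tailed probability of a CR this large under the normal curve."""
    return math.erfc(abs(cr) / math.sqrt(2.0))

# Figures quoted above for McConnell (1955); p_hit = 1/6 is an assumption.
all_ss = critical_ratio(73.5, 18_092, 1 / 6)      # all Ss combined
organized = critical_ratio(-26.3, 8_534, 1 / 6)   # one target face per night

print(round(all_ss, 2), round(two_tailed_p(all_ss), 2))
print(round(abs(organized), 2))
```

Under these assumptions the computed values agree with the reported p of .14 and CR of .76.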

Overlapping with the night tests was the day series carried out in 1948-50 with photographic recording of 393 wishing Ss. The results by the criterion of target hits were entirely negative (McConnell et al., 1955).

This study with the Duke dice machine was repeated by Dale and Woodruff in 1951-52 in a combined PK-ESP test with 108 Ss and 62,208 dice readings, and the results were insignificant with respect both to hits and declines (Murphy, 1952b). On the PK test, the overall deviation was -13 hits, "fantastically close to chance deviation," with zero deviation on the first half of 31,104 die readings.⁴

⁴ [...] April 1950, Dale, personal communication.

Finally, there is the extremely long series of throws (not in Table 2) reported by Scarne, an avowed and
scornful skeptic. His criticisms of the 
use of ordinary commercial dice, and 
the care with which his dice were 
measured for accuracy before ac- 
ceptance are extremely pertinent 
(Scarne, 1956). Precision measuring 
instruments were used to select true 
dice and each pair was discarded after 
every 3,600 throws. Scarne, wishing 
for 7's, scored for Lows (2, 3, 4, 5, and 
6), Highs (8, 9, 10, 11, and 12), and 
7's. Beginning in 1940, he continued as opportunity permitted for about 15 years, stopping after 6,000,000 rolls. By chance there should have been 1,000,000 7's and 2,500,000 hits each for Lows and Highs.
The final tabulation reported was 
2,499,998 Lows and 2,500,001 Highs, 
the remainder turning up as 7’s. The 
significance of the difference between 
these results and those obtained by 
Weldon (Fisher, 1938) whose dice 
were obviously high-score biased, is 
readily apparent. 
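
The chance expectations for Scarne's series follow from enumerating the 36 equally likely outcomes of two true dice. The sketch below is illustrative only: the function name is hypothetical, and the count of 7's is inferred as the remainder of the 6,000,000 rolls, as the text describes.

```python
from fractions import Fraction
from itertools import product

def two_dice_expectations(rolls):
    """Expected counts of Lows (2-6), 7's, and Highs (8-12) for fair dice."""
    ways = {"low": 0, "seven": 0, "high": 0}
    for a, b in product(range(1, 7), repeat=2):
        total = a + b
        if total == 7:
            ways["seven"] += 1
        elif total < 7:
            ways["low"] += 1
        else:
            ways["high"] += 1
    # 15, 6, and 15 of the 36 outcomes, respectively
    return {k: int(Fraction(v, 36) * rolls) for k, v in ways.items()}

expected = two_dice_expectations(6_000_000)

# Tabulation quoted in the text; 7's taken as the remainder.
reported = {"low": 2_499_998, "high": 2_500_001}
reported["seven"] = 6_000_000 - reported["low"] - reported["high"]

for key in ("low", "seven", "high"):
    print(key, expected[key], reported[key] - expected[key])
```

On these figures no category departs from expectation by more than two rolls in six million.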

There are to be noted in this period 
more attempts to test for the role of a 
psychological factor in dice throwing. 
Two such series are those of Dale 
(1946) and Nash (1946) which have 
been interpreted as evidence for PK. 

The four studies at the ASPR will 
be considered as a unit (Dale, 1946; 
Dale & Woodruff, 1947). The initial 
study, with positive findings, was 
carried out in 1946, and the other 
three, with negative results, were 
completed in 1947. In the first study 
(Dale, 1946) each of 54 Ss (29 female, 
25 male) had one session in which 
dice shaken in a cup were cast down a 
chute, landing on a platform. The 
targets, respectively, were in the fol- 
lowing progressive sequences: Faces 
1-6, 2-1, 3-2, etc. The Ss were 
divided into six equal groups, with
one of the target sequences assigned
to each of the subgroups. There was
an equal number of trials for each 
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target face. Records were kept of 
every die face cast. Both S and E 
kept records and any discrepancy, 
during comparisons after each run, 
was resolved by taking the lower 
score. A dozen sessions or so were 
witnessed in whole or in part by ob- 
servers. 
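
One way to read the "progressive sequences" (Faces 1-6, 2-1, 3-2, etc.) is as cyclic rotations of the six faces, so that every face serves as target equally often in every serial position across the six subgroups. The sketch below rests on that reading, which is an assumption; the function name is hypothetical.

```python
def progressive_sequences(faces=6):
    """Cyclic target orders: 1-6, 2-1, 3-2, ... (assumed interpretation)."""
    base = list(range(1, faces + 1))
    return [base[start:] + base[:start] for start in range(faces)]

sequences = progressive_sequences()
for group, order in enumerate(sequences, start=1):
    print(f"Group {group}: {order}")

# With 54 Ss divided into six equal groups, nine Ss would follow each order.
ss_per_group = 54 // len(sequences)
```

Rotating the starting face in this way balances any dice bias toward particular faces against position effects such as declines, which is presumably why equal trials per target face were required.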

The Ss comprised two groups, those who accepted the possibility of the influence of mind over matter (N = 41) and those who did not (N = 13). The difference between the Believers, with a mean of 4.117 (+116 hits), and the Nonbelievers, with a mean of 4.176 (+55 hits), was statistically insignificant. Declines in scoring were noted from run to run on the data page. There was an incline (increase in scoring) from page to page, resulting in twice as many hits on the last three pages as on the first three pages. It was noted by Dale that in Nash's (1946) study, "as in our experiment, both groups scored well above chance" (p. 144), and that the attitude expressed by the S toward the possibility of PK was "a variable of no importance" (Dale, 1946, p. 132).

In the second and third studies 
(Dale & Woodruff, 1947) both Es 
made independent records. In the 
second study (24 Ss) there was also an 
electrically operated randomizing 
chute and photographic recordings of 
die faces. In the third test (54 Ss), 
the electrical gadgets were elimi- 
nated. In the fourth, and final study, 
Dale worked alone to replicate com- 
pletely the conditions of her first 
study. The results of Experiments II, 
III, and IV were entirely negative. 
Analysis of the throws in these three 
tests showed that Faces 5 and 6 al- 
ways gave positive deviations and 
Faces 1-2 always scored negative 
deviations. The striking declines in Experiment I were entirely absent in II and III, with an incline detected in IV which was opposite to the effect obtained in I. There was no increase in number of hits in the last three data pages of Experiment IV, such as was noted in Experiment I. There was no support
for the hypothesis derived from Ex- 
periment I that females score better 
than males. The most likely inter- 
pretation, therefore, of the positive 
results in Experiment I is the same as 
that concluded for the three subse- 
quent negative studies: that “no 
clear evidence for the operation of 
psychokinesis was found" (Dale & 
Woodruff, 1947, p. 79). 

The initial study by Nash (1944) 
was followed by several subsequent 
tests which were essentially psycho- 
logical in design. Not as well con- 
trolled as some others, such as Dale 
(1946, 1947) and Van de Castle 
(1958), with the use of undergraduate 
students as Es, less confidence is to be 
placed in the reliability of these data. 
A test of Believers versus Nonbelievers (Nash, 1946) was carried out by an undergraduate student who counted and recorded the hits. The scoring was confirmed by S. About one-half of the rolls were made at a distance of 3 feet from the release box, the remainder from a distance of 30 feet. The number of runs rolled by each S varied from 48 to 176, and the only data reported were for the group as a whole. Because of Ss' target preferences, there were 112 runs for Faces 2 and 5; 128 for Faces 1, 4, and 6; and 144 for Face 3. For the group as a whole, the pooled data showed positive deviations for all faces. A typical QD (greatest difference between first and fourth quarter, diagonal decline) was reported but the difference was not statistically significant. The mean for Believers was 4.43 and that for the Nonbelievers was 4.34 (chance = 4.00/run), with the difference reported as insignificant. No significant differences occurred between near and far distances. This study (Nash, 1946) was
quoted by Rhine as evidence for Ss 
doing as well at far distances as com- 
pared to near distances (Rhine, 1953), 
but the negative results on distance 
reported the following year (Nash & 
Richards, 1947) were ignored. This 
latter follow-up was carried out by 
Richards, as an undergraduate stu- 
dent in zoology. Each S had two 
tests, a month apart, for a total of 32 
runs, with split halves of Ss arranged 
for R(eward) and No R(eward) at 3- 
feet and 30-feet distances from ap- 
paratus. Complete randomization of 
targets was not obtained. 

There was a total of 256 runs for each of the six target faces and the overall score of +158 hits (p = .014) was considered suggestive. The scores on the several target faces were: Faces 2 and 3 combined, +43 hits; Face 1, -23; and Faces 5 and 6 combined, +136, with zero deviation for Face 4. But Nash and Richards (1947) stated that the "hypothesis of dice bias cannot account for the results" (p. 274). On the distance test, there were +39 hits at the 3-foot distance and +119 hits at the longer distance, but the authors conclude that one cannot say that this gave "a conclusive difference" (p. 279). Decline in scoring was not typical; instead, there was a general tendency to inclines.

In what was essentially a Wish and 
No Wish test, data were collected in 
1948 by Bray, an undergraduate 
student assistant (Nash, 1956). Targets were selected by card-drawing from a shuffled deck (numbered 1 to 6). There were six dice per throw, three red and three white on every release. In Series 1 and 2, S selected which set (color) was to come up with the given
target face. The pooled data for all 
three series were given in terms of 
the score for the (three) wished-for 
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dice, the deviation being insignificant. 
Inadvertently or not, there was an 
obvious built-in control on wishing. 
If one compares Series 1 and 2, one 
has a direct test of the wished-for and 
ignored sets of three dice. The scores 
for the two sets were, respectively, 
—25 hits and —41 hits. Presumably 
all six faces served as target, although 
target order and face distributions 
were not reported and the obvious un- 
importance of wishing was ignored. 

Also to be noted under the psycho- 
logical rubric were two Duke studies 
by Humphrey, with an interesting 
design feature but unfortunately 
marred by serious weaknesses. In the 
Help-Hinder series (Humphrey, 
1947a), with “three different Ss par- 
ticipating" (p. 4), a thrower wishing 
for self-chosen target was helped or 
hindered by the wishing of an ob- 
server, as decided by the latter. In 
the first situation, the latter wished
for S's target; in the second, the ob-
server wished for some nontarget
face. S was presumed ignorant of the 
observer's attitude (help or hinder). 
But since E required more time to 
record scores both for S and observer 
when latter was hinder-wishing (i.e., 
two entries), S must have had some 
inkling of observer's attitude. Only 
hits were recorded by E, so that face 
distribution for all throws is not 
known. The number of runs for 
given targets, as chosen by S, was not 
equal, nor was the experiment of a 
predetermined length. Dice bias was 
considered irrelevant since positive 
deviations occurred on all six faces 
serving as targets. 

On the equivalent of 177 runs, ob- 
server wishing for S’s target, the 
score was +95 hits. On 213 hinder- 
runs S scored +32 hits, whereas 
observer scored +12 hits on her self- 
chosen (different) target on the same 
throws. Combining the scores on the hinder trials, as Humphrey submits, the score becomes +44 hits on 426 runs, which when combined with the help runs constitutes the totals given in Table 2 (+139 hits on 603 runs).

The actual number of runs was 390, 
but with each run scored twice (S and 
wisher) the reported total was 603, 
with plus deviations on all die faces 
as target for a total of +139 hits 
(Table 2). It is unfortunate that the 
experimental design was not more 
rigorous, with equal representation 
for all faces as target in some ran- 
domized order and two independent 
recorders ignorant of target designa- 
tions. The possibility of recording 
errors in this situation constituted a 
serious weakness.  Equally critical 
was the failure to incorporate some 
balanced order of wish (help or 
hinder) with nonwish sequences. 

In a simultaneous High-Low test 
Humphrey (1947b), working alone as 
S, E, and recorder, used six red and 
six white dice on every throw. Always 
scoring for 1's, S for one-half of the 
runs wished the red to be high and the 
whites to be low; and on the remain- 
ing runs wished for the white to be 
high and the red to be low—in all 
cases with respect to 1’s. Overall 
score on high wish was +45 hits 
(insignificant) and for low wishes 
—179, the latter reported as signifi- 
cant. An unavoidable change in dice 
occurred during the study.  Unfor- 
tunately all experimental functions 
were performed by one person. Aside from the problems of trying to keep different mental sets for the two colors of dice, the tendency to read the dice incorrectly and the absence of an independent (naive) recorder (who did not know the mental sets of S) were irreparable flaws.

In this connection, the carefully 
controlled study on personality and 
PK reported by Van de Castle (1958) 
should be noted. Carried out by White (who has reported a series of successful ESP tests), the study combined a variety of personality tests with ESP cards and drawings, as well as a PK dice test. No
relationship was reported between 
performance and any of the tests, 
including sheep-goat dichotomy 
(Schmeidler), expansive-compressive 
ratings (Humphrey), Rosenzweig 
picture-frustration, Rorschach, or IQ 
scores. All Ss had equal opportunity 
for all die faces as target in a prede- 
termined order not known to E (S 
having been instructed not to reveal 
target until the test was completed). 
The total deviation of +32 hits for 
the 31 Ss was insignificant, with an 
overall incline of +26 hits from first 
to fourth quarter. Thus the usual analysis of the data from this experiment offers "no support for an interpretation of PK" (Van de Castle, 1958, p. 136).

There remain to be considered a 
few reports, of slight consequence, 
which can be briefly summarized. 
With a fellow undergraduate as S, Herter served as recorder. With equal representation of all faces as target, S was soon free to begin with any face and choose any order of targets. Positive deviations were recorded for all target faces. The experiment was halted when S left school. It was "the first and only experience with PK testing for both C. J. H. and his S" (Herter & Rhine, 1945, p. 24). Rhine commented that there is no guarantee that the results could be duplicated by them "for their motivation would not be the same" (p. 24).

Gatling, another unsupervised stu- 
dent, acted as E in some sessions and 
as S in others (Gatling & Rhine, 
1946). In this study, four amateur 
gamblers were matched against four 
ministerial students one of whom was 
Gatling. At the beginning, Ss were 
free to pick from three sets of dice and
to select the target at the beginning of
each column of the score sheet. Later
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in the experiment, target choice was 
curbed since it was intended to equal- 
ize the number of throws for each 
face as target. There was large varia- 
tion in the number of runs contributed 
by each S. The combined total for 
the four Ss, "regarded as successful 
at the crap game" (p. 120) was 540 
runs compared to the combined total 
of 702 runs for the four ministerial Ss. 
All Ss combined, there was a plus 
deviation on each die face as target. 
It was reported that the decline [QD 
of the page] was not typical. This 
experiment covered a 2-week period 
in June "and was terminated by 
W. G.'s departure from the univer- 
sity” (p. 12). 

The penultimate study by Gibson 
(1947) consisted of an "informal 
interim report," with apparatus simi- 
lar to that of Dale (1946). Half the 
data were obtained in one long session 
and the remainder were collected in 
two successive monthly tests. The 
statistical significance attached to the 
pooled data (Table 2) is due to the 
score on the second part (+52 hits on 
72 runs) of the first session—in which 
the scores were checked by S and the 
wives of S and E. This session was 
started with S and E having a beer, S 
feeling relaxed as the test began. In 
the final study of Gibson (1948), with independent recording by S and E and an orderly arrangement of target sequence, there was an overall insignificant deviation from chance and an absence of typical decline.

The unwitnessed husband-wife team of Vasse and Vasse (1951) reported significant results only for the wife, who "had a marked effect" in a previous attempt to influence the growth of seedlings (p. 264). Her scores of +69 on 135 runs for High and +45 on 111 runs for Low targets account for the significance of their pooled data in Table 2.

In an exploratory ESP-PK study, 

Sis (1953), wife, and friend served as 
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Ss in a series of single die throws for 
hidden targets. The friend, working 
alone and recording her own data, 
completed her final trials on a vaca- 
tion, after a lapse of some time. With 
all data pooled, p is given as .0004. 
There was inequality in targets, the 
number of trials was not the same for 
all Ss, nor was the same die used 
throughout. Because of the explora- 
tory nature of this effort, no attempt 
was made to use “full precautions of 
standard test conditions” (p. 301). 

In this category belongs Mangan's 
(1954) first published report working 
alone for the first half and with an 
observer who recorded during the 
second half of the test. Following a 
preliminary test, there were negative 
deviations for Faces 1-3 and positive 
deviations occurred with Faces 4-6. 
For the main experiment, "it was 
predicted that the scores on the high 
dice would be higher than those on 
low dice" (p. 210). Overall, High 
Face score— 242 and Low Face 
score=+45. Typical decline (QD) 
was absent. 

In Knowles (1949), a statistically 
significant result was obtained by 
pooling data from a preliminary test 
(not in Table 2) with the main series. 
These combined data were statisti- 
cally indistinguishable from a subse- 
quent control series with a single 
(actually) loaded die weighted on 
Faces 4-6. 

In the test by Pratt and Woodruff 
(1946), a copy of the experimental 
design was deposited with the librar- 
ian before the tests were started. 
With equal representation of all the 
die faces as target the dice were re- 
leased from rotating tubes against a 
barricade. The results were not differ- 
ent from chance. The scores for Faces 
1-2 totaled a minus deviation with 
the largest positive scores obtained 
on the higher target faces. This carefully executed study was labeled [...]

Belonging in this category is Thouless' (1951) unwitnessed solo and casually executed effort, fitted into "odd times when I happened to have a spare twenty minutes" (p. 108),
which deserves mention even if only 
marginal significance is attributed to 
the data pooled from different design 
conditions. It is one of the two re- 
ports on record incorporating, at 
least in part, a latin square design for 
target designations (also cf. Thouless, 
1945a, 1945b). 


WISHING WITH OBJECTS OTHER
THAN DICE


In Table 3 are summarized the few 
reported tests making use of objects 
other than dice, primarily discs and 
coins. 

In the first of the three studies making use of discs, McMahan (1945) (a) tested college students individually and then (b) tested a group of children at adolescent PK parties in the home of one of the Ss, with prizes awarded. In a, the main test involved plastic discs which were released through a system of baffles onto a table. Ss wished for the objects to come to rest with a designated face up. In b, the group helped in concentrating. The
scores in both parts of the study were 
statistically insignificant. The second 
study was essentially like the group 
party-situation already described 
(McMahan, 1946). Again, the over- 
all deviation was insignificant 
(CR=.67), but significance was 
attributed to difference between first 
and fourth quarters of trials. 

In the final study (McMahan, 
1947) more or less the same experi- 
mental situation obtained, with equal 
number of trials in a normally lighted 
room and in a dark situation in which 
scores were recorded by flashlight. 
Overall deviation was again insig- 
nificant, but the light-room score was 
-61 hits, and the dark situation was +54 hits, with a reported CR of 2.45. "Ss on the whole preferred the dark" (p. 51), and there were reported striking
declines in scoring rates for first and 
second parts. Gibson has reported 
opposite results in a dark test. With 
a social situation involving children 
and adolescents and flashlight scoring 
in darkness, it would appear to be 
mandatory to have two independent 
recorders, not bothered with handling 
the experimental details. 

Solo and unwitnessed, Thouless (1945a, 1945b) did his coin tossing from a ruler, on which coins were put in random order. Surprisingly enough, although the mathematicians and statisticians have traditionally employed coin tossing in relation to the binomial theorem, it has been little used in the present connection. Ten coins were arranged half heads, half tails on a ruler (in mixed order) and Thouless did not look at them when tossing. Overall, there were, respectively, -16 heads and +58 tails for each side as target. Thouless (1945b) remarked that "It is obvious that the result is not of any value as independent evidence for PK" (p. 169). Working alone, Bailey (Pope, 1946) glass-tossed a penny onto a rug for 100 trials per session, and also made a series of single-die throws. The journal (Pope, 1946), commenting that these attempts offer suggestive data on the comparative success of dice and disks in PK experiments (p. 213), presents a combined score for the two procedures with a p of .012. The report by Binski (1957) was
the thesis for which the PhD was 
awarded. The results with coin tos- 
sing and roulette wheel guesses for 
red or black were insignificant. An 
additional series with one S (not in our tabulation), which had no prearranged experimental plan and involved optional stopping, was considered highly significant; and some new unpublished tests were "encouraging but not statistically significant" (p. 290).

TABLE 3
WISHING WITH OBJECTS OTHER THAN DICE

Author | References | Objects | |
McMahan* | (1945) | discs | | 46,000
McMahan* | (1946) | discs | | 12,600
McMahan* | (1947) | discs | | 8,800
Thouless | (1945b) | coins | — * | 4,000
Pope | (1946) | coin | Bailey* | 1,000
 | | die | [...] | 216
Binski | (1957) | coins | 117 | 153,000
 | | roulette | 123 | 26,200

* Duke Report.
Minor Report.
E, as S, working alone.


PLACEMENT-WISHING 


In this approach, released objects 
were wished to land in designated 
areas. The reports, beginning in 
1951, are listed in Table 4. 

The first placement test, an innovation in PK research, was done by a businessman, Cox (1951). It consisted of a combined test of hits for given target faces and placement scores in 252 marked-off squares, numbered from 1 to 6 in such fashion that no two adjacent areas were given the same number. There were several [...] of varying conditions, only the [...] of which Cox considered experimentally adequate. The two objectives, hits and area, were alternated as primary and secondary targets, and [...] instructed to concentrate on a [...] primary target and ignore the [...]. E always called aloud the [...] for the primary area, counting the score for the secondary objective [...]. But scoring was always done [...] hits on die faces whether it [...] the primary or secondary target. Target designation and scoring was done by E, who progressed from one to six die faces as targets.

Combining the scores on dice hits 
and area placement into primary and 
secondary targets, the 1,632 standard
runs (number of item throws divided by 24)
yielded scores of +100 for the pri- 
mary (CR of 1.36) and —426 for the 
secondary targets (CR of 5.78). In 
Series III, the only positive deviation 
was +18 hits when the area was the 
primary target. There was reported 
a significant score of —139 hits when 
the respective objectives (hits or 
placement) were secondary targets 
(576 runs in each case). 
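
The CRs quoted for the pooled primary and secondary targets can be reproduced from the run counts alone. The sketch below assumes the standard run of 24 die throws with a chance hit rate of 1/6 for both objectives (six die faces, six area numbers); the constant and function names are illustrative, not Cox's.

```python
import math

THROWS_PER_RUN = 24   # standard run, as defined in the text
P_HIT = 1 / 6         # assumed chance rate for faces and numbered areas alike

def cr_from_runs(deviation, runs):
    """CR for a deviation in hits accumulated over a number of standard runs."""
    sd_per_run = math.sqrt(THROWS_PER_RUN * P_HIT * (1 - P_HIT))
    return deviation / (sd_per_run * math.sqrt(runs))

primary = cr_from_runs(100, 1632)
secondary = cr_from_runs(-426, 1632)
print(round(primary, 2), round(abs(secondary), 2))
```

With these assumptions the computed values match the reported CRs of 1.36 and 5.78, which suggests the published figures were derived in this way.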

Considering the difficult recording 
task, and the absence of independent 
checking, one need not be too con- 
cerned with the large negative devia- 
tions noted when the given target 
area or dice faces was secondary tar- 
get. Details are lacking on the distri- 
bution of scores for hits or area place- 
ment and only the pooled mean scores 
for primary and secondary targets 
were submitted. E recognized that 
the experiment (Cox, 1951) was 
equivalent to "unwitnessed observa- 
tions and recording” (p. 43). 

The second report by Cox (1954) 
belongs with the present group in 
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terms of general technique. Over a 
period of 2 years, 24 dice and an equal 
number of marbles were released on 
every trial for placement test alone, 
into an area divided in half by a fine 
wire, much like the Cormack ap- 
paratus (cf. Pratt, 1951). With vary- 
ing numbers of Ss in three series of 
tests, carried out at different times 
over a period of 2 years, S attempted 
to wish for hits in a specified section 
area. Unlike the first study, target 
and nontarget areas were alternated 
on successive trials. The reported 
pooled data are given in Table 4. 

To test the difference between the 
dice and marbles, Ss concentrated on 
both types of objects in the first 
series, on dice or marbles in the 
second series, and one or the other as 
preferred in the final series. Primary 
and secondary targets were scored in 
the last two series but, unlike the first 
report, the difference was insignificant. Throughout, the score with marbles was positive (+102 area hits) and that with dice (consistent with first report) always was negative (-162 area hits). All these data
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were pooled, giving a CR of 2.93 for 
significant differences between dice 
and marble (p=.003). The experi- 
mental inadequacies of the initial 
effort were equally prevalent in this 
follow-up by Cox. And no subse- 
quent attempt has been made to 
replicate these findings under care- 
fully controlled conditions. Instead, 
a report of "exploratory nature" is 
made 5 years later on a new tech- 
nique, i.e., a three dimensional place- 
ment test (cf. Cox, 1959). 

Following a suggestion from Pratt, 
Cormack, a retired businessman 
working solo, carried out a number of 
placement tests (Pratt, 1951). With 
variations in size and weight and 
number of objects, releases were 
made manually or electrically down 
an inclined plane into two horizon- 
tally divided areas. S willed for dice 
to end in one or the other section. 
The significance of each of the nine 
series varied, but by pooling the 
scores a significance of better than 
p=.000001 is obtained. 

L. E. Rhine (1951) also reported a 
large scale test of placement-wishing. 


TABLE 4
PLACEMENT WISHING

Note.—[...] used was not specified; [...] objects in target and nontarget areas [...] area. In Knowles' test, rotating [...]
ᵇ Minor Report.
E, as S, working alone.


In the latter study, a nonspecified number of Ss, including E, were tested with objects released by lever down an inclined plane, "willed to fall to left or right" on table below. Marbles, coins, and cubes were used.
In this study, there was an attempt 
to check the recording-error prob- 
lem; E read the score aloud and S 
checked the recording; any object on 
the dividing line was thrown again. 
With a miscellaneous number of Ss, 
for 46,800 released objects, there was 
a total of —28 hits. With a number 
of other Ss, a total of 66,300 releases 
yielded +176 hits. For the pooled 
data the CR of .88 was insignificant. 
With encouragement from Rhine, 
the following test was carried out by 
Wilbur and Mangan (1956). Glass 
marbles and steel balls were electri- 
cally released down a runway onto a 
surface with six slots. Right and left
sides were alternately designated as
targets. A checker called out scores
which were entered by a recorder,
both then checking the entries. The
score with marbles in the first series
resulted in +31 hits, the steel balls
score totaling -6 hits, and the overall
deviation was insignificant. In a
second and final study, a test was
made with three degrees of roughness
of the inclined plane and glass balls
alone. The overall deviation was
insignificant.


A preliminary report by Knowles 
(1952), with E and her brother as Ss,
was an attempt to wish a rotating
pointer, spun by hand, to stop at
given target segments, with a paper
scale divided into 15 such areas.
Scores were obtained in terms of hits
and amount of deviation from given
target areas. Both Ss had an equal
number of trials, with p=.0008 for
their pooled scores. E acted as recorder
for herself and her brother,
except that her husband recorded for
her third session. Working at home,
Steen (1957), a California businessman,
reported an "exploratory" approach
with one S in which dice
throws were scored in terms of base- 
ball rules applied to a PK test. This 
was considered by Thouless (1958) as 
"distinctly encouraging" and "For 
those not interested in baseball, the 
method could, no doubt, be adopted 
to the rules of cricket" (p. 205). 


SENSITIVES 


In starting the PK program, Rhine 
made no attempt to search for in- 
dividuals who would be particularly 
“gifted.” In the initial phase of the 
dice tests, Ss for the most part, there- 
fore, were unselected. Almost 20 
years later, Rhine (1953) commented 
that “no special person had to be 
sought out as mediums; in fact, my 
wife and I began in 1934 by testing 
ourselves, our family members, our 
friends, students, and even casual 
visitors” (p. 140). However, some 
PK tests were carried out with Ss 
presumed to be sensitive because of 
earlier successes in ESP tests. Gib- 
son's S was previously successful in 
card guessing (Gibson, 1947). Shackleton,
one of Soal's famous Ss (Soal &
Bateman, 1954), was given a PK test 
by Parsons (1945) but the results 
were negative. 

Belonging in this category too are 
the dice tests with Blundin which are 
considered below in detail. Forwald's 
largely solo efforts in placement wish- 
ing are also included here since con- 
tinuing success over a period of years 
would be presumptive evidence of a 
sensitive. 


Blundin 

The series with Blundin was carried
out in England. At first, she was included
among a group of 10 Ss, largely unwitnessed
and doing their own recording
on a variable number of trials
for targets known only to E. In these
results, obtained by Mitchell and
Fisk (1953), as given in Table 2,
"targets were not exactly equalized"
(p. 49) and “the experiment may be 
recognized as a pilot study in which 
the possible effect of dice bias was 
not definitely excluded” (Fisk & 
West, 1957, p. 6). Blundin's score 
was recognized as of marginal signifi- 
cance only. 

In a subsequent test with Blundin
alone, "S became ill and the tests had
to be abandoned" after more than
5,000 die throws had been made
(Fisk & West, 1957, p. 3). While
Blundin was checking the data analysis
by Fisk, from the original data
sheets, "these records were lost in the
confusion following her entry into
hospital" (p. 3). Although the intended
design was a Latin square, it
was incomplete and the situation involved,
even if inadvertently, optional
stopping.

The next tests with Blundin (Fisk 
& West, 1958) were started when her 
"recovery by the spring of 1955, was 
sufficient" (Fisk & West, 1957, p. 4). 
With a new set of dice, tests were 
carried out on 6 successive days of each
week, 48 die throws per day. Fisk
and West, on alternate weeks, exposed
targets in ignorance of the
other's targets. The recording by
Blundin of her throws was witnessed
by an observer on about one-third of 
the trials. The targets for 1-6 faces 
were chosen from a table of random 
numbers. The total die throws were 
1,440 with Fisk and 1,392 with West, 
but the authors (Fisk & West, 1957) 
concluded that a significant differ- 
ence between them “has hardly been 
established" (p. 4). 

In the final phase 2 years later, the 
targets were mailed to S in sealed 
envelopes, each of which contained a 
given target and marked alphabetically
A-F in randomized procedure.
There were 600 die throws for each
target and 100 die throws per day.
"In previous experiments, Dr.
Blundin had thrown three dice in each
cast. As she did her own recording,
generally without witnesses . . . it
was thought that by using only two
dice at a time, likelihood of such
errors would be greatly reduced"
(Fisk & West, 1957, p. 5). In this
series, there was a score of +48 hits
(p = .032).

The pooled data for the four 
series are given in our tabulation 
(p=.00017). A technique was de- 
vised of scoring each throw in terms 
of D(ie) O(rientation) to obtain 
quantitative estimates of success by 
grading throws in terms of direct 
hits, 90° off- and 180° off-hit position
(cf. Mitchell & Fisk, 1953, 1954)
and the reported DO scores were 
somewhat better than the hit crite- 
rion. The authors concluded that 
“the results of these five experiments, 
spread over a period of six years [did 
not] provide fairly satisfactory sta- 
tistical confirmation” of a PK effect 
(p. 5). Blundin, in another connec- 
tion, offers pertinent sources of error 
in one of the rare introspective re- 
ports of a card-guessing S after she 
became a member of SPR (Blundin, 
1952). In any case, lack of complete 
randomization of targets, absence of
witnesses, and independently re- 
corded data, were irreparably basic 
defects—subject to no possible sta- 
tistical rectification. 


Forwald 


The result of the first of a long
series of tests by a Swedish engineer,
Forwald, was published by Rhine
(1951). Of the entire series, only one
of the last involved a substantial
number of Ss (Pratt & Forwald,
1958). For the most part they constituted
a solo series.

An electrically controlled con- 
tainer released the dice down a run- 
way onto a walled table, S wishing for 
placement into one or the other of the 
equally divided horizontal sections. 
Successive groups of five trials were 
wished for, alternating the A and B 
sides as designated target area in the 
first series, and the ABBA order in 
the second series. In the first series, 
the two areas were divided by ink 
line and in the second series by a 
wire. The pooled data for the two 
series (A area = +727 hits and B
area = -227) in Table 4, were given
a p of .00000006. The scores on two 
respective nonwishing, control series 
were +104 hits for A area and —196 
hits for B area (11,000 die releases). 
Rhine (1951) noted that the per- 
sistent favoring of the A-side was 
"due no doubt partly to the align- 
ment of the dice channel or some 
other lack of structural symmetry in 
the apparatus" (p. 51). 

In the next test, the two place- 
ment areas were divided with a fine 
wire to force released cube to lean 
one way or other (to reduce judgment
factor in scoring) and, with five
trials as a unit, the order of target
area was ABBA. Nothing was said
of target area A-bias, or its possible
relation to an overall score of +320
hits on A side and -263 hits on B
side (Forwald, 1952b, p. 60). With
respect to the pooled, insignificant,
deviation of +57 hits (p = .49), "Undoubtedly
the explanation lies in
some important though subtle,
change in my motivation" (p. 62).

For the next report, in each of three
series one kind of material (three
cubes) was wished for and, as a control,
the other trio of a different material
was ignored. The wishing now,
instead of "strong-willed" as previously,
was in a "calm manner."
From here on, only the combined 
score of hits for A and B targets was 
given, and no reference was made to 
A-side bias. The U curve of hits previously
noted failed to appear. The
pooled data for this study were reported
with a p of .00007 and the
deviation for the control component
was insignificant (Forwald, 1952a).
In view of the variations in conditions,
some reservations are inevitable
concerning the pooled value for all
three experiments of +691 hits (with
a p of 3 in 10,000,000).

In Forwald (1954b), two colleagues 
were used as Ss. Although the overall 
deviation was insignificant, the two 
Ss showed a significant decline in 
success from first to second half of 
trials during the first part. Appar- 
ently Ss began to lose interest, as 
there accrued an insignificant num- 
ber of hits, and Forwald therefore 
informed them about the decline 
effect to stimulate interest. In the 
final half of the study the significant 
decline failed to appear, and "This
suggests an inhibiting influence of
conscious knowledge upon the effects"
(Pratt, 1955). 

Reverting to solo performance, 
Forwald (1954a) substituted for the 
number of (A and B) area hits, a 
criterion in terms of mean score. The 
entire area was marked off in spaced 
grids, and the position of each cube 
was measured to obtain a mean value 
for the respective A and B target 
areas. The mean scores for the two 
areas both were in the expected 
directions. The falls were photo- 
graphed in one series and sent to 
Duke for independent measurement. 
The score was insignificant, but in 
the first (unphotographed) series 
measured visually by Forwald, the 
difference was significant. 

A comparison was now made between
two individuals working alone
and two other Ss working in competition
(Forwald, 1955a). The combined
score for the two Ss who worked in
competition with each other was
insignificant but that for the two Ss
who worked individually was significant.
Of all four individuals, the only
significant score was that of Forwald
(one of the individual workers) "acting
as both S and E" (p. 46). A control
series "showed some relation to
the PK placement series" (p. 51).

The primary consideration in the 
next report (Forwald, 1955b) was the 
relation of cube roughness to first 
throw effect. It is of some interest, 
considering the original A-side bias, 
that smooth and first stage roughness 
cubes scored a minus (-) mean score
(on A side) whereas very rough cubes 
scored a plus (+) mean deviation (on 
B side). 

In the next solo effort, with respect 
to a consideration of the experi- 
mental and control series, it was 
observed that "there must be a
substantial difference . . . when conscious
attempts are made to influence
the dice and when the subject
is mentally neutral . . . [But] the re- 
sult in the 'control series' is signifi- 
cant in itself . . . [so there could be a] 
back-effect on the subject from the 
placement series" (Forwald, 1957, 
p. 116). 

Forwald then moved on to Duke 
for a "confirmation" (Pratt & For- 
wald, 1958). Fourteen series were 
carried out under the general re- 
sponsibility of Pratt with newly con- 
structed apparatus. Nothing was 
said of tests for side bias and no con- 
trol tests were reported. Five releases 
constituted the experimental unit, 
with successive wishes for the two 
areas. Only the B-A differences were 
reported as means for Throw 1 for 
each unit of five trials and for all 
trials combined. On the first two
series alone, the mean scores were 
insignificant; only the score for 
Throw 1 (first 300 trials) reached the 
.01 criterion. In the next section of 
two more series, with P. M. "as passive
observer and second recorder"
(p. 10), the scores were insignificant 
for the total of 600 cube releases and 
it was concluded that the observer 
"seemed to upset the scores" (p. 9). 

A subsequent test series with PM, 
Forwald acting as S, was positive: on 
all throws, p=.010 and on first 
throws, p = .006. Compared to the
other staff members, PM apparently 
entered more fully into the spirit of 
the test when she was cosubject, and 
Forwald (Pratt & Forwald, 1958) 
"felt he was able to concentrate upon 
the task to the same degree as when 
working alone" (p. 10). The results
on five overlapping series each with a 
different S were undistinguished. 

Because of "the success of the
team" (p. 10), two final series were 
obtained with PM. Both Forwald 
and PM recorded independently. 
The mean scores were insignificant. 
But the means for the First Throws 
in the penultimate series "is the 
largest difference obtained in the 
work at Duke University and is not 
equaled by any of H. F.'s previous 
results" (Pratt & Forwald, 1958, p.
12). The comparable mean for first
throws on the final series was not
significant, but the two pooled gave a
p of .0002. In terms of planned experimentation,
this study was free-wheeling,
with no predetermined design
and no recorders who were ignorant
of the wished-for area. In addition,
the tests were terminated
"because H. F. had to return to
Sweden at that time" (p. 12). The
third McDougall Award for distinguished
work in parapsychology was
given to Forwald in 1959 by the Duke
Parapsychological Laboratory for this
article jointly published with Pratt.

Regardless of the debate concerning
Forwald's results (cf. Forwald,
1954c; Nash & Forwald, 1956; Soal,
1954), in the entire series there
wasn't a single well-designed and
controlled experiment. Photographic
recording and objective measurements
would be the least requirement.
Even in the last study at Duke
under the manifest supervision of
Pratt, the several series involved
optional stopping, poor recording
technics, and obvious lack of supervision.
This undoubtedly constitutes
the longest streak of reported successes
(1951-61) in all of the history
of the movement and exceeds that of
any of Rhine's or Soal's previous Ss.
And it is pertinent to ask, as Nicol
(1954) does with respect to Forwald,
"At what stage in the difficult history
of psychical research it became permissible
for sensitives to report their
own results and expect them to be
accepted as serious evidence in psychical
research, I do not know. The
number of such reports has grown
disturbingly in recent years" (p. 355).
Similarly, Thouless (1945a, 1945b,
1951) has been criticized for solo, unwitnessed
efforts (West, 1954a).
Yet Forwald's work began some 9
years after the publication of "ESP-60"
(Rhine et al., 1940) which with
Soal and Bateman (1954) are held
forth by some as the two most convincing
documents in the field. By
the standards of the 1940 Duke
ESP-60 all of Forwald's (1951-1961)
data would have been unacceptable.
And these very standards are contradicted
by Rhine (1951) when, in
publishing Forwald's first data, he
wrote, "There is of course justification
for the solo type of research in parapsychology
since all other sciences
[...] it and owe a great deal to it" (p.
50). It should be explicitly clear that
the solo effort in scientific endeavor 
is indeed an honorable practice, but 
only if the conventional scientific 
criteria are applied; e.g., replication 
of reported results. 


DECLINES 


The case is made for PK on the 
basis of two criteria: an excess num- 
ber of hits, and/or a significant de- 
cline in the number of hits. The first 
statement constituted the hypothesis 
for the tests carried out during the 
initial period at the Duke Laboratory 
under Rhine's leadership. As noted 
earlier, for Rhine (1946b) the basis 
for the PK hypothesis was the con- 
fidence that players often have that 
"they can when ‘hot’ actually influ- 
ence the dice to some extent to follow 
their desires without the use of trick- 
ery of any kind” (p. 7). The second 
criterion—the decline hypothesis— 
was an unexpected result of a post- 
mortem analysis almost exclusively 
of the original studies of the first 
period (Table 1) and some reports 
published shortly afterwards. The 
position was given in three papers 
(Rhine & Humphrey, 1944a, 1944b; 
Rhine, Humphrey, & Pratt, 1945). 
McConnell et al. (1955), agreeing with
this approach, specifically state that
“From an operational point of view, 
one might say that the psychokinetic 
effect is those two effects when they 
appear without a known physical 
mechanism to explain their occur- 
rence" (p. 269). 

The declines, or "data structure 
effects" (McConnell, 1957, p. 134), 
are detected by some unit of the 
record sheet of successive segments of 
trials. A variety of analyses have 
been made and McConnell admits 
that some methods of decline analysis 
“have been tailored to fit the in- 
dividual experiment" (McConnell et 
al., 1955, p. 272). For him, the most
suitable is that of quarter distribu- 
tion of the page. 

Essentially, the decline test was 
derived as follows. Assume that 50 
standard runs (i.e., each 24 die
throws) are recorded in successive 
vertical columns, left to right. If the 
page is now divided into halves hori- 
zontally and vertically, the four 
quadrants are such that the first 
halves of the first 25 runs are located 
in the upper left (first) quadrant and 
the last halves of the last 25 runs are 
recorded in the lower right (fourth) 
quadrant. The lower left (second) 
and the upper right (third) quadrants 
would contain, respectively, the last 
half of the first 25 runs and the first 
half of the second 25 runs. Pooling 
the scores of the respective quadrants 
for a number of studies, the reported 
distribution was such that there was 
a marked decline from Q1 (upper left)
to Q4 (lower right) quadrants. In the
ideal situation there would be a decreasing
number of hits, progressively
from Q1 to Q4. Instead of the
QD (i.e., Quarterly Decline) of the
page, the test could be applied to the
half page or set (group of runs without
a pause). It was this pooled
difference between Q1 and Q4 (i.e.,
decline in hits) which became the 
hypothesis for the second criterion of 
PK. 
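
The quadrant bookkeeping described above can be made concrete with a short sketch. It assumes a record sheet stored as a grid of hit/miss marks with one run per column, and it assumes a simple binomial model with a chance hit probability of 1/6 for the critical ratio of the difference between the first and fourth quadrants. The function names and the choice of model are illustrative and are not taken from the Duke reports.

```python
import math

def quarter_distribution(sheet):
    """Count hits per quadrant of a record sheet.

    sheet[row][col] is True for a hit. Each column holds one run,
    written top to bottom; columns run left to right in time order.
    """
    n_rows, n_cols = len(sheet), len(sheet[0])
    half_r, half_c = n_rows // 2, n_cols // 2
    hits = {"Q1": 0, "Q2": 0, "Q3": 0, "Q4": 0}
    for r in range(n_rows):
        for c in range(n_cols):
            if not sheet[r][c]:
                continue
            if c < half_c:
                # Left half of the page: the earlier runs.
                hits["Q1" if r < half_r else "Q2"] += 1
            else:
                # Right half of the page: the later runs.
                hits["Q3" if r < half_r else "Q4"] += 1
    return hits

def decline_cr(q1_hits, q4_hits, trials_per_quadrant, p_hit=1 / 6):
    """CR of the Q1 minus Q4 difference under an assumed binomial model."""
    sd = math.sqrt(2 * trials_per_quadrant * p_hit * (1 - p_hit))
    return (q1_hits - q4_hits) / sd
```

For a page of 50 runs of 24 throws, each quadrant holds 12 x 25 = 300 throws, so chance expectation under this model is 50 hits per quadrant.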

Since the decline hypothesis was 
derived largely from the results ob- 
tained with the Early Dice tests, 
these data will be considered separately
from the Later Dice tests. (The re- 
sults obtained with objects other 
than dice, Table 3, and placement 
series, Table 4, are not suitable to test 
for the lawful declines attributed to 
the dice scores.) This treatment is 
especially called for with respect to 
the decline hypothesis. First, Rhine 
makes the case largely on the basis of 
the early tests. Secondly, since the
hypothesis was derived from these 
data, independent evidence is re- 
quired for its confirmation. The 
question is, then, the extent to which 
these declines occur in the early dice 
tests and the extent to which support- 
ing evidence is forthcoming in the 
subsequent tests. 


Early Dice Tests 


The prior analysis in terms of target
scores readily suggests that the
disclosed weaknesses in experimental 
design apply with equal force to the 
lawful declines in scoring. Addi- 
tionally, one can anticipate that 
these data would not be suitable for a 
test of the post hoc decline hypothesis 
since the experiments were not de- 
signed for this purpose. And as Pratt 
(1947b) has since noted:
There are two reasons why the number of 
analyses made in the earlier investigations to 
discover possible psychological factors under- 
lying hit pattern cannot be made. . . the tar- 
gets were not randomly distributed in the 
order in which they were used on the record 
page [and] no one was interested in position 
effects at the time the tests were made, and 
the order of targets on the page had no rele- 
vance for the objective of the experiment. The 
sole purpose was to see if a significant total 
score could be obtained under carefully 
guarded conditions which excluded the possibility
of dice bias (p. 198).

Perhaps the most extensive control 
test of “wish versus no wish” is the 
Frick 60-Dice solo (Rhine & Hum- 
phrey, 1945a). As previously noted 
(Table 1) there was a total of +582 
hits with six-face as target on 2,172 
runs and +576 hits on an equal num- 
ber of runs when wishing for 1’s (or 
wishing for 6's not to appear). 

Both Frick Series A (experimental 
wishing for 6's) and B (control) are 
included in the QD analysis (Rhine & 
Humphrey, 1944a). According to this
report the QD decline on the control 
series was typical and considerably 
larger than when wishing for 6's.
“Both series, concluded H. L. F[rick],
must therefore have been due to im- 
perfections in the dice favoring the 
six-face" (p. 47). Not so, says Rhine, 
for the control series gave a “normal 
QD pattern—one more typical, in 
fact, than that of the original Sixty- 
Dice Series!" From this it was con- 
cluded that in PK as in ESP, “the 
essential mental act is unconscious" 
(p. 48). This interpretation can be
compared with Frick's own view that 
the "dice are crooked" (Rhine & 
Humphrey, 1945a, p. 205).

Among the other early tests, there 
is to be found no other truly control 
test. In the large Gibson series 
(Gibson et al., 1943, p. 234), a block 
of 1,944 runs was performed with 
targets equalized. A QD analysis for 
this block was not presented. Indeed, 
a very few unwilled control tests were
made, and no effort was made to test 
them for position effects. When Pratt 
(1947a) reported a nonwishing con- 
trol test, which demonstrated high- 
face dice bias, no analysis was in- 
corporated with respect to declines. 
Thus lacking control tests, the con- 
sequent dependency upon the Prob- 
ability Model demands a rigorous 
control of the known physical condi- 
tions; e.g., random selection of target 
designations, equal representation of 
all faces as targets. 

In the early tests, dice bias was
clearly manifest, confirming Weldon's
much earlier evidence. In these tests,
Ss were more or less free to choose
their targets, which largely were
preferences for high faces. Suppose,
for example, there is a six-face bias
for given dice. If in addition, six-face
is chosen largely in the first part
of the series and other faces are
used as targets later in the test,
the situation is ideally designed
for a decline in hits to occur.
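
The argument can be illustrated with a small calculation of expected hits, with no PK assumed at all. The face probabilities below are invented for illustration (a hypothetical die that slightly favors the six-face), as are the schedule lengths and the function name; nothing here is taken from the Duke data.

```python
def expected_hits(targets, face_probs):
    """Expected number of hits when each throw's target face is listed in order."""
    return sum(face_probs[t] for t in targets)

# Hypothetical biased die: six-face slightly favored, other faces equal.
face_probs = {1: 0.164, 2: 0.164, 3: 0.164, 4: 0.164, 5: 0.164, 6: 0.18}

# Unbalanced schedule: six-face as target early, one-face as target late.
early, late = [6] * 600, [1] * 600
decline = expected_hits(early, face_probs) - expected_hits(late, face_probs)

# Balanced schedule: every face is the target equally often in each half.
balanced_half = [1, 2, 3, 4, 5, 6] * 100
balanced = (expected_hits(balanced_half, face_probs)
            - expected_hits(balanced_half, face_probs))

print(decline, balanced)
```

With these invented numbers the first half is expected to exceed the second by about 9.6 hits purely from bias combined with target order, whereas the balanced schedule shows no expected decline.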

In the Gibson "cage" series (Gibson
et al., 1944) Faces 4, 5, and 6 were
used as targets much more frequently 
on the left side of the page and Faces 
1, 2, and 3 proportionately more on 
the right side of the record page. By 
examining the raw data, kindly made 
available to him by Betty Humphrey, 
Parsons’ analysis clearly demon- 
strated that if all faces had been 
used with equal frequency, there 
would have been an incline, rather 
than a decline in hits. And an ex- 
amination of a total of 10 studies in 
which all faces were represented as 
targets indicates "no sound evidence
that a QD decline occurs on all six
faces." In at least seven important
American series, the scores were
markedly bunched up towards the
start of the page. In the first and
only successful ASPR study, "more
than one-third of the entire positive
deviation obtained in the whole experiment
comes from the first half of
the first run" on each record sheet
(Dale, 1946, p. 137). This is also of 
some interest with respect to For- 
wald's placement series in which the 
criterion became the first trial score. 
Such a degree of bunching of the 
anomalous scores into just those trials 
where physical conditions were likely 
to be unsteadiest is strongly sugges- 
tive of a physical condition. 

The evidence thus strongly sug- 
gests that the position effect most 
likely was an attribute of dice bias, 
i.e., associated mainly with high
faces. Of equal significance is the 
lack of correlation between hits and 
declines. The only evidence offered 
for a correlation of position effects 
with hits was found in Pratt's (1947b) 
analysis of the Gibson Machine Series 
(Gibson et al., 1944) which he admits 
“can only be taken as suggestive of 
an increase in the prominence of position
effects as the total score goes
up” (p. 202). 


Later Dice Tests 


It is to be noted that, as with the 
early tests, seldom were positive re- 
sults obtained with respect to both 
criteria: number of hits and decline in 
scores. In the first and only positive 
results from the ASPR by Dale 
(1946), with an excess of hits re- 
ported, there was no significant 
change in scoring rate. On the other 
hand, in the study by McConnell 
et al. (1955), a small but significant 
QD was reported although the total 
number of hits was within chance 
expectancy. In a number of studies, 
declines of one sort or another have 
been noted, but did not conform to 
the lawful decline. 

Of the later tests, only one de- 
serves attention in terms of the QD 
hypothesis. Overlapping in time with 
McConnell's (1955) night series, the 
more extensive day series was carried 
out in 1948-50 with 393 volunteer Ss 
(McConnell et al., 1955). For one- 
third of the trials, one pair of dice 
was cup thrown and for the re- 
mainder another pair was machine 
rotated. It is not clear whether the 
machine dice were the same as those 
used for the night (McConnell, 1955) 
series. Targets given by E were ac- 
cepted by 281 Ss and 112 Ss chose 
their own. It was reported that the 
results of the 383 Ss (each of whom 
had a single target) were essentially 
no different from those obtained with 
the 7 Ss who were permitted to 
change targets during the test ses- 
sion. M, S, and P as Es, respectively, 
recorded for 173 Ss, 132 Ss, and 88 
Ss, making hand records of the scores
which were later matched against
photographic recordings of the machine
throws. A sample of 50% of the
photographic recordings was incorporated
in the present brief report.

As reported (McConnell et al., 
1955), "the first of the two statistical
effects constituting psychokinesis
was not found" (p. 271), the deviation
being negligible (-91 hits).
For the authors, "the half page,
rather than the page, is the natural
psychological unit in the present experiment"
(p. 273). The decline
analysis, incorporating two-thirds of
all the trials (56,592 initial versus
equal number of final trials), indicated
a significant decrease of
218 hits (CR=2.72). Accepting
McConnell’s statement on face value, 
it is nonetheless still not entirely 
satisfactory that the targets used 
“form an essentially random sequence 
—with one exception: some target 
numbers were used more often than 
others” (p. 274).

Humphreys (1956), a skeptic, has 
emphasized that "the two interactions
involving halves of the page,
which signified a break in the experimental
procedure, are smaller than
one would expect to obtain by
chance" (p. 291) and he concluded
that the phenomenon was not a
genuine psychological effect because, 
with experimental conditions held 
constant, significant individual dif- 
ferences were lacking. In his rebuttal, 
in which the disputed interactions 
were judged to be of chance origin, 
McConnell (1958) raised another 
issue in which he argued that Hum- 
phreys had no right to assume that 
the psychokinetic effect, if real, was caused by 
the 393 subjects whom we tested. In our paper 
we said “The procedure was one in which the 
experimenter was present and aware of the 
target number" ... the decline effect may 
have been partially or entirely caused by the
experimenters. Thus the model which Dr.
Humphreys uses relates to a hypothesis which
we could not claim to have tested (pp. 214f.). 


The mean scores for the Ss for each
of the Es were extremely alike. What
seems of equal importance is that the
variance of the groups of Ss for the
three Es was markedly different.
Least, and insignificant, decline occurred
for the 173 Ss tested by
McConnell. When his two assistants
were Es, there were reportedly sig- 
nificant declines in hits. The main 
source of significance in initial versus 
final decline was the data of the 132 
Ss tested by Snowdon. Concerning 
the possible role of the E, a difference 
arises from a comparison with the 
overlapping night (sleep) series. In 
the latter the results with McConnell, 
as S, were reportedly positive; where- 
as results with Snowdon as S (long 
distance test while in Peru) were 
insignificant (McConnell, 1955). 

Humphreys’ (1956) suggestion 
"that he would like to see an independent
replication of the experiment"
(p. 290) brings a nod from
McConnell (1958) who says, how- 
ever, "not in simple replication, for it 
must be remembered that our work is 
already a validation of similar work 
by others" (p. 215). Yet a replication 
in 1951-52 of the McConnell day 
series with the same Duke apparatus
by Dale and Woodruff (Murphy, 
1952b) was completely negative with 
respect to hits. This series of 68,208 
die recordings with 108 Ss, was also 
equally negative with respect to declines
in scoring.

It cannot be overlooked that with 
better control of conditions, in the 
succeeding tests with dice throwing,
absolute differences were attenuated
until they became insignificant and
declines became smaller. If one
argues on an inferential basis, the
residuals were most likely statistical
artifacts as a consequence of minor
inadequate experimental conditions.
As McConnell recognizes, the decline
effect in his data was small and under
such circumstances "Chance can
never be ruled out as a possibility no
matter what level of significance is
accepted" (Humphreys, 1956, p. 291).


DISCUSSION 


For the beginning of the scientific 
controversy concerning "psychical"
phenomena, a convenient reference 
point is that of the formation of the 
Society for Psychical Research in
London in 1882 and the subsequent 
organization of the American Society. 
Attention was largely devoted to the 
observational study of spontaneous 
phenomena, such as reported premo- 
nitions, clairvoyant and telepathic 
occurrences, and telekinesis. 

For the young psychologist, and 
most laymen, the story begins with 
the publication of "ESP" (Rhine, 
1934), a summary of the first Duke 
card-guessing tests. (Many are sur- 
prised to discover that card tests were 
carried out long before Rhine's efforts 
were initiated at the Duke Parapsy- 
chology Laboratory.) This volume 
was characterized as "one of the most
important contributions to psychical
research yet published" (Murphy,
1934, p. 454) and Murphy predicted
that "the twenty-fifth century will
give Rhine's experiments the impor- 
tance they deserve in the history of 
science" (p. 457). 

With this publication by Rhine, 
the controversy was refocused not 
only between Believers and Skeptics 
(cf. Kennedy, 1939; Wolfle, 1938), 
but also among the paramutual inter- 
ests as well. The old school British 
workers, associated with the SPR for 
50 years, were challenged by ESP 
(Rhine, 1934) reporting "a large
number of persons who, it was
claimed, had guessed consistently
above chance expectations" (Soal,
1947, p. 25). It was particularly 
disturbing to the British workers, for 
whom the finding of a sensitive was a
rare event, to face the report that "at
least nine or ten gifted subjects were 
discovered within the small com- 
munity of Duke University" (Soal, 
1947, p. 25). The older controversy 
needs no retelling here, but the same 
situation obtained with the subse- 
quent PK tests: first reports by Rhine 
of many successful Ss subsequently 
lacked confirmation by British 
workers, and the magnitude of the 
scores was reduced under more care- 
fully controlled conditions. The at- 
tacks from the skeptics were no 
stronger than the pertinent and 
pointed within-group criticisms. 
Rhine admits that Soal has been 
among his most severe critics (Soal & 
Bowden, 1959). Here it is to be noted, 
the demand for replication is considered
reasonable: "Positive results
in card guessing and dice throwing 
have been reported in America on a 
scale for which there is no parallel [in 
England] If the American claims 
are genuine we should be forced to 
assume that the psychic faculty is 
extremely rare in England compared 
to America" (Soal, 1948, p. 183). 

With respect to PK, the evidence 
is strikingly clear that to a large ex- 
tent the earlier PK tests were poorly 
designed and badly executed. There 
is complete justification for the judg- 
ment that “the Duke experimenters 
seem to have fallen into pitfalls that 
an intelligent school boy should have 
avoided" (Soal, 1948, p. 185). As 
examination of the data discloses, by 
the simplest criterion of mere replica- 
tion of test conditions, many of the 
tests fall by the wayside. For ex- 
ample, it was noted that time and 
again Ss were free to choose target 
face, which is "particularly unsatis- 
factory," that equal numbers of trials 
were not obtained with the several 
target faces, and that the use of a
hand-shaken container in a majority of
these Duke pioneer reports was not
completely foolproof. With the
motor-driven dice machine (e.g.,
Gibson et al., 1944; Price & Rhine, 
1944), the extent to which the speed 
of the die caster was adjusted to suit 
the S was not reported (cf. West, 
1945). 

The difficulties raised by an experi- 
mental design resting upon theoreti- 
cal rather than empirical probabilities 
need no detailed treatment with re- 
spect to PK. Dice bias was demon- 
strated long ago in Weldon's long 
series (Pearson, 1900) and presum- 
ably can be avoided by using "true" 
dice and spaced replacement (Scarne, 
1956). Whatever else is disclosed by 
the entire series of PK efforts, in its 
most favorable light, the high scores 
in hits are attributable to dice bias. 
The only support for the hypothesis 
of decline in scoring is to be found in 
the study by McConnell et al. (1955). 
The latter effect, however, was not 
confirmed in a subsequent repetition 
with the same apparatus (Murphy, 
1952b). The inherent danger in de- 
pending upon the Probability Model 
is illustrated by Oram's experiment 
of “random” selection of numbers 
from Kendall and Smith's tables 
(Oram, 1954), which provides the
"most significant single QD quarterly
decline in annals of psychical
research" (Nicol, 1955, p. 80). The use
of a rigid mechanical system such as 
a rotating dice cage (Gibson et al., 
1944; Price & Rhine, 1944), especially 
with the same dice, gives no assurances
of random distribution of dice faces 
(cf. Brown, 1957). In fact, West 
(1945) warned that there is the 
danger of turning out "repetitions
that are not properly random" (p.
286). This is especially pertinent
when the declines are small. Again,
the need of a controlled experiment
as the proper corrective is clearly
indicated: a test of declines without
wishing is still to be carried out.
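
The contrast between theoretical and empirical probabilities can be illustrated with a small calculation. The sketch below is not taken from any of the studies reviewed; the counts are hypothetical and the function name `z_score` is introduced only for illustration. It shows how the same hit count looks extreme against the theoretical 1/6 yet unremarkable against the rate the same die yields in a no-wishing control run.

```python
import math

def z_score(hits, trials, p):
    """Normal-approximation z for a hit count when each trial hits with probability p."""
    expected = trials * p
    sd = math.sqrt(trials * p * (1.0 - p))
    return (hits - expected) / sd

# Hypothetical counts (not from any study cited here).
trials = 60_000
wish_hits = 10_400       # target face observed while wishing for it
control_hits = 10_380    # same face observed in an equal-length no-wishing run

# Against the theoretical 1/6 the excess looks large ...
z_theoretical = z_score(wish_hits, trials, 1 / 6)

# ... but against the rate this particular die actually shows, it does not.
# Simplification: the control rate is treated as exact, ignoring its own sampling error.
z_empirical = z_score(wish_hits, trials, control_hits / trials)

print(round(z_theoretical, 2), round(z_empirical, 2))
```

A fuller treatment would compare the two runs with a two-sample test; the point here is only that the choice of baseline decides whether a bias in the die is counted as a "hit."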

In the light of Weldon's series, it
is inexcusable to have permitted
the unrestrained use of high faces or
sixes almost as the sole targets in the
cited one-half million throws, even
if it was felt necessary to preserve
a game-like environment. To date
(1961), there is no published study
which has thoroughly incorporated
a latin square design for target assignment
to control for dice bias.
Additionally, failure to insure accuracy
of scores by photographic recording
or by independent observers
ignorant of target designations introduces
irreparable weaknesses. The
sources of recording errors are many
and motivation may determine the
type of error committed. Thus,
Kennedy and Uphoff (1939), skeptics,
have given such evidence in regard to
card guessing studies by showing differences
in the type of error depending
upon whether S was a Believer or
Disbeliever. Kaufman and Sheffield,
also skeptics, reported similar
findings with respect to dice throwing,
Believers tending to make errors
in favor of PK whereas Disbelievers
tended to make errors in the opposite
direction (Anonymous, 1952).
This of course is one of the basic
reasons for avoiding solo efforts, let
alone depending upon the observations
of untrained individuals. The
importance of this consideration for
psychic research was emphasized by
members of the London Society at
the turn of the century, and many
psychologists since (e.g., Coover,
Kennedy, 1939). One would
expect it no longer to be an issue.
But the original dice tests at Duke
and the continued solo efforts of
Forwald are no encouragement at all
in this respect (cf. Forwald, 1959,
1961).
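
As an illustration of the latin square design for target assignment mentioned above, the following minimal sketch builds a cyclic 6 x 6 square. The function name and the reading of rows as sessions and columns as runs within a session are assumptions made for the example, not a procedure reported in any of the studies. Every face serves as target exactly once in each session and once in each run position, so a bias toward any one face contributes equally to every row and column.

```python
def cyclic_latin_square(faces=6):
    """Row r, column c holds the target face for session r, run c (both hypothetical labels)."""
    return [[(r + c) % faces + 1 for c in range(faces)] for r in range(faces)]

square = cyclic_latin_square()
for row in square:
    print(row)
```

In practice the rows and columns of such a square would also be randomly permuted before use; the cyclic form is shown only because it is the simplest to verify by eye.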

As better controls were intro- 
duced, the high scoring was reduced. 
The later, more careful, studies of 
PK showed only chance deviations in 
hits. And it is not relevant to argue 
that there is a statistically significant 
decline in scoring. Rhine's (1946b) 
statement, that "The best controls on 
the faulty dice hypothesis . . . consist 
of significant differences in scoring 
rate due to certain effects of position 
of the trial in the test sequence" 
(p. 9), is simply a profession of strong 
belief, the only effect of which is self- 
convincing. The post hoc interpreta- 
tions by no stretch of the imagination 
constitute "the best controls on the 
faulty dice hypothesis" nor rectify 
the initial weaknesses in experimental 
design. It is simply a new hypothesis 
requiring new confirmation. Since 
the QD papers, the better designed 
subsequent tests generally have been 
negative in outcome, the only excep- 
tion consisting of the preliminary 
report by McConnell et al. (1955) 
which, as already noted, lacked subsequent
confirmation.
tests clearly de- 


sis, appropriate post hoc interpretations
are offered; the QD of the
control series was "typical"; that of the
experimental series was not. The
scores in both series were practically 
identical and highly significant. And 
this in a skeptical S. Although the 
view is maintained that a skeptical 
attitude interferes with performance, 
Frick’s results, notwithstanding, are 
accepted by Rhine as evidence for 
PK because of these position effects. 
Likewise, McConnell (1958) suggested
a post hoc interpretation
which his experiment had not been
designed to test. In replying to
Humphreys' (1956) criticism of a
lack of significant individual differences,
McConnell (McConnell et al.,
1955) argued that "the procedure
was one in which the experimenter
was present and aware of the target
number. Thus, the word 'subject'
is used provisionally or in a nominal
sense" (p. 270) and "we cannot rule
out the possibility that the decline 
effect may have been partially or en- 
tirely caused by the experimenters" 
(McConnell, 1958, pp. 214 f.). A test 
of this question would have required 
that E not be aware of target designa- 
tions. In the nonwishing control 
series by Pratt (1947a)—which con- 
firmed high-face bias in excavated 
dice—he goes so far as to say that: 
In such a control series there is always present, 
however, the theoretical possibility that the 
thrower is influencing the dice by PK whether 
he is consciously setting himself to do so or 
not. This causes no difficulty so long as the 
research is concerned with the problem of 
evidence that PK occurs, for all that matters 
is whether the difference is significant (p. 56). 
This is an application of the Probabil- 
ity Model with a vengeance but, of 
course, the converse is equally ap- 
plicable: no matter how intense the 
wishing, what assurance is there that 
the results are in fact due to wishing? 
The only solution of this difficulty 
would be provided by the inclusion of 
a control series. Should Pratt's view 
be taken seriously, then of course no 
controlled psychological experimental 
test of PK is possible. 

The fact of the matter is that the 
PK data largely are unrelated to 
psychological problems. In the early 
and ancient history of telekinesis, re- 
ports are to be found of individuals 
"willing" objects to rise, as well as 
other illustrations of mind over mat- 
ter. These reports, mythological or 
not, do suggest psychological prob- 
lems which are experimentally testable.
And the use of dice, because of
the game-like situation to evoke in- 
terest, is entirely proper. There have 
been suggestions, published and 
otherwise, that a better test would be 
to determine whether a sensitive 
could will the movement of a delicate 
wire: "Unlike any other force of
which we have any experience it 
[PK] is more successful in influencing 
96 dice thrown together than a single 
die... [yet] it is incapable of mov- 
ing a delicately suspended needle" 
(Soal, 1948, p. 185). No reports of 
such tests have originated from the 
Duke Laboratory but the British 
tests, published and unpublished, 
were uniformly negative (also cf. 
Carrington, 1938). 

Few of the PK reports fulfill the 
basic requirement of a psychological 
experiment. To have psychological 
justification, there must be a con- 
trolled comparison such as wish for 
versus wish against, or wish for 
versus no wishing, or Believers ver- 
sus Disbelievers. This is but the 
simplest of considerations. The few 
studies which involved such a control 
test were negative (e.g., Van de
Castle, 1958) and the interpretations 
varied from ignoring this outcome to 
"gratuitous explanation" that the
process was unconscious. As a
hypothesis to be tested, this alterna- 
tive is also properly acceptable. But 
nowhere were tests made of Ss versus 
no Ss; this hypothesis requires a control
series without wishing, i.e., no
knowledge by anyone of the targets.
Evidence of PK as a psychological
phenomenon is therefore totally lacking.
And this deficiency will persist
until the effect is produced in the
presence of a specified psychological
variable, and the effect does not ap- 
pear in its absence. 
PROBLEM


Whereas unrotated factors are
mere mathematical entities, peculiar
to a particular matrix, many psychologists
contend that factors rotated to
simple structure correspond to either 
dimensions or causal influences which 
have a real existence in the sense that 
they can be recognized independently 
by other approaches (Cattell, 1952). 
However, no general agreement or 
knowledge exists as to what range of 
scientific entities the factors may rep- 
resent. For example, are factors only 
broader concepts into which measure- 
ments can be resolved; or are they 
organizing influences within more
varied expressions of phenomena; or 
are they causal influences in the gen- 
eral scientific sense? 

Although it is possible to answer
this question on philosophical grounds
up to a certain degree, we can be sure
that the matter is being properly
answered only when we have empirical
data from a large number of instances.
A major reason why psychologists
lack dependable information
with which to answer questions
of this kind is that they continue to
use factor analysis on psychological
data in which they do not know beforehand
what influences are really at
work.

Even working with a known structure
will not throw light on these
aspects of factor analysis if it is too
remote from scientific considerations
in which factor analysis is commonly
applied. For example, a completely
artificial model where correlations
are calculated by back multiplication 
from a postulated structure has some 
place in explaining what factor analy- 
sis does. But this easy model can at 
best only return us the mathematical 
structure which was written down. 
Somewhat more concrete examples, 
like Thurstone’s box problem (1947, 
pp. 140-144) (if, indeed, such an ex- 
periment were to be done on real 
boxes!) and the bottle experiment 
(Barlow & Burt, 1954) have the ad- 
vantage that real experimental errors 
of measurement can be included in 
the analysis. Both of these, however, 
involve static relationships in which 
the factors are all limited to the class 
of physical or spatial dimensions. 
They thus fail to simulate the diverse 
properties and interactive qualities of 
influences in the psychological situation.
Their principal contribution
seems to have been to show that fac- 
tor analysis by means of the simpler 
linear approximation can produce 
tolerably clear statements of relations 
and structures that are known to be 
mathematically complex. 

The senior author over the past 
decade has, therefore, set up several 
examples which approach the ideal of 
being: (a) organic, in the sense of 
dealing with vital, chemical, or be- 
havioral measurements; (b) inclusive 
of experimental error, in a natural 
measurement situation; (c) not arti- 
ficially brought by selection or other 
means to the mathematician’s arti- 
ficiality of orthogonality of factor 
relationships; and (d) of incontrovertibly
known factor structure before
factor analysis begins. These have 
included a study on the growth of 400 
tomato plants under different condi- 
tions of two factors, hydroponic feed- 
ing and light; two studies of 100 cups 
of coffee (Cattell & Sullivan, 1962);
and, with the second author, the 
study of 80 balls which shortly will 
be described. A survey of experience 
in laboratories and the graduate de- 
partment shows that it is through a 
lack of an available standard, real 
example that many factor analytic 
procedures and statistical tests fail to
achieve positive evaluation and re- 
vision. 

The ball study, like the other real 
examples mentioned above, provides 
the research worker with a known 
structure upon which various factor 
analytic practices can be tried. For 
example, the study may be used to 
test a proposal for completeness of 
factor extraction and for studying 
the various ways of estimating communalities,
since the number of factors
is definite and known. Again, it
may be used for evaluating kinds of 
approximations to factor solutions, 
for the relative magnitude and the 
sign of the loadings are known with
reasonable accuracy. It may also be 
used to test procedures for rotation to 
simple structure, such as new analytic 
techniques, since the correlations be- 
tween the primary factors were con- 
trolled. 

With controlled relations among 
the factors, the ball study is enlight- 
ening in resolving the important issue 
of orthogonal versus oblique factor 
rotations. It is with this issue that 
the present article is primarily concerned.

The question of orthogonal versus 
oblique factor axes seemed to the 
senior author to be answered 20 years 
ago by those same logical and psychological
considerations as convinced
Thurstone (1947) that oblique factors 
(including orthogonal factors only as 
a special case) could alone expect 
to correspond to scientific entities. 
These considerations are: 

1. It is unreasonable to expect that 
a great variety of influences operating 
and interacting in the same universe 
would be completely uncorrelated. 

2. The observation that in most 
examples where one in fact knows the 
structure, it is perfectly obvious that 
the factors (even if uncorrelated in 
the population) will tend to be cor- 
related in the given sample in which 
they have to be discovered (e.g.,
length and height in a population of 
boxes). 

3. The experimental finding (Cat- 
tell & Warburton, 1961) that second 
order factors, obtainable only through 
the obliqueness of primary factors, 
tend to reproduce themselves, in 
different matrices, with considerable 
consistency and good psychological 
meaning. 

4. The finding (Cattell, 1957) that 
simple structure oblique factors are 
experimentally replicable with de- 
cidedly greater constancy of pattern, 
across various samples, than are 
orthogonal factors. 

From these and other considera- 
tions Cattell concluded that the goals 
of orthogonal axes and uniquely de- 
termined simple structure are mutu- 
ally inconsistent (except by some 
coincidence). Nevertheless, it is true 
that even in recent years a minority
of psychologists, including some of
high reputation, have continued to
advocate and teach the search for
replicable, simple structure factors
while maintaining orthogonality of
representation. Moreover, they have
advocated this procedure not as an
approximation or as a distortion
justified by the wish to avoid the
more complicated calculations accompanying
oblique factors, but as a
correct scientific principle. It seems
desirable to show clearly why free- 
dom for obliquity of rotation is es- 
sential by presenting a concrete il- 
lustration, the ball study. 


BALL STUDY 


Out of more than 150 balls pur- 
chased in local stores, 80 were selected 
to represent the typical range of 
qualities in balls as the term is under- 
stood. The balls ranged in size from 
about 1 inch in diameter to more than 
7.5 inches, and in weight from .1 
ounce to more than 15 ounces. When 
dropped onto a standard rebounding 
surface from a height of 28 inches, 
the range of rebound was from almost 
0 to more than 19 inches. 

The balls were selected primarily 
to cover the range of the attributes, 
size, weight, and height of rebound. 
(Henceforth, we shall use the physi- 
cist's term, elasticity, rather than the 
height of rebound, but we shall rede- 
fine elasticity to be the ratio of the 
rebound height to the initial height.) 
To get good distributions some of the 
solid wooden balls were hollowed out 
to make them lighter; other balls 
were covered with cloth or with tape 
to make them heavier and at the same
time to reduce their ability to rebound.
Included in the sample were
marbles, ping-pong balls, golf
balls, hollow rubber balls of all sizes,
tennis balls, baseballs, softballs, croquet
balls, and large playground
balls.¹
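
The redefined elasticity is a simple ratio, and a short sketch makes the scale concrete. The function and constant names below are introduced only for illustration; the 28-inch drop height and the 0 to 19 inch rebound range are the figures given in the text. (The physicist's coefficient of restitution is the square root of this height ratio, which is presumably why the authors speak of redefining the term.)

```python
DROP_HEIGHT_IN = 28.0  # drop height stated in the text, in inches

def elasticity(rebound_in, drop_in=DROP_HEIGHT_IN):
    """Elasticity as redefined in the text: rebound height divided by initial height."""
    return rebound_in / drop_in

# The reported rebound range of roughly 0 to 19 inches maps onto
# an elasticity range of roughly 0 to 0.68.
lowest = elasticity(0.0)
highest = elasticity(19.0)
print(lowest, round(highest, 2))
```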


1 A more detailed description of the 80 balls,
the table of measurements of the balls on
the 32 variables, an analysis of the 32 variables,
the product-moment correlation matrix, and
pictures of the equipment used in the experiment,
can be found in the study by Dickman
(1960) listed among the references.
TABLE 1

CORRELATIONS BETWEEN THE ATTRIBUTES
DIRECTLY MEASURED

              S      W      E      L

Size        1.00    .70    .13   -.07
Weight       .70   1.00   -.26   -.05
Elasticity   .13   -.26   1.00   -.01
Length      -.07   -.05   -.01   1.00


Next, 80 pieces of string ranging
from 10 to 60 inches in length were 
cut so that the lengths distributed 
normally over the range. The differ- 
ent lengths were assigned to the balls 
in such a way that length would have 
near zero correlations with size, 
weight, and height of rebound of the 
balls. The correlations between these 
four attributes are shown in Table 1. 
The strings were attached to the 
balls so that the ball suspended from 
a hook could swing like a pendulum. 
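
The text does not say how the string lengths were matched to the balls so as to leave length nearly uncorrelated with the other attributes. One possible way to do it, offered purely as a sketch, is to try many random assignments and keep the one whose largest absolute correlation is smallest. All names and the synthetic ball attributes below are hypothetical; only the 10 to 60 inch range, the roughly normal spread, and the count of 80 come from the text.

```python
import random
import statistics

def pearson(x, y):
    """Product-moment correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def draw_length(rng, lo=10.0, hi=60.0):
    """Draw one string length from a normal curve centered on the range, redrawing if outside it."""
    while True:
        value = rng.gauss((lo + hi) / 2, (hi - lo) / 6)
        if lo <= value <= hi:
            return value

def assign_lengths(attributes, lengths, rng, tries=2000):
    """Keep the shuffle of lengths whose largest |r| with any attribute is smallest."""
    best_order, best_score = None, float("inf")
    order = list(lengths)
    for _ in range(tries):
        rng.shuffle(order)
        score = max(abs(pearson(order, attr)) for attr in attributes)
        if score < best_score:
            best_order, best_score = list(order), score
    return best_order, best_score

rng = random.Random(0)
# Synthetic stand-ins for the measured attributes of 80 balls (illustration only).
size = [rng.uniform(1.0, 7.5) for _ in range(80)]
weight = [0.3 * s + rng.uniform(0.0, 2.0) for s in size]
rebound = [rng.uniform(0.0, 19.0) for _ in range(80)]
lengths = [draw_length(rng) for _ in range(80)]

assigned, worst_r = assign_lengths([size, weight, rebound], lengths, rng)
print(round(worst_r, 3))
```

The near-zero entries in the Length row of Table 1 are the kind of outcome such a search would aim for.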

The variables for the ball study 
were designed to relate to four ball
"traits." These were the size, weight,
elasticity, and string length of the balls 
and will be designated by S, W, E, 
and L. 

The psychologist always selects 
tests from a particular domain of 
interest. In the ball study, this do- 
main was restricted narrowly to the 
four expected physical influences, but 
the possibility existed, as in any 
study, for unexpected factors to be 
discovered. For example, some additional
factor of surface texture, or
darkness of color, might have been 
found. The important point is that in 
devising variables such as might have 
interested Newton, we still left our- 
selves in rotation open to the possi- 
bility, within that domain, of an 
infinite number of resolutions unex- 
pectedly different from those derived 
by an analysis of the physical influences.
Thirty-two tests were designed to
evoke and to measure the "behavior"
of the species of population we may
call "balls-suspended-on-strings."
The first four items were the diameter,
weight, height of rebound, and
length of string, which with "backstage"
knowledge one would realize
were pure factor markers for S, W, E,
and L. The remaining 28 items may
be characterized as dynamic, for most
of them involve the ball, or the ball
on its string, in motion and in interaction
with equipment fabricated for
this study. For example:

Item 5. The ball is allowed to roll 
down an inclined plane, a distance of 
4 feet, and the number of ball 
rotations is counted. 

It would be possible to factor only 
the 28 variables with the four pure 
marker variables left out until factor 
resolution had been obtained. Then 
these markers could be introduced 
later as a check on identification. 
Actually, we carried out the analyses 
both with and without the four 
marker variables, but only the full 
32-variable analysis is reported here.²
To anticipate, it may be mentioned 
that the correlations between the 
factors eventually found (see Table 
5) closely approximate those found 
between the four marker tests. In- 
deed, the correlations between the 
markers were controlled and the 
lengths of string were cut so that 
string length would be nearly orthog- 
onal to the others. 

2 In a second experiment we carried out the
procedures without the four marker variables.
These results are almost indistinguishable
from the present results.

The variety of these ball behaviors
was considerable as can be seen from
the list of variables below. Moreover,
all of them allow errors of measurement
to occur in a natural setting,
not dissimilar from the psychological
situation. For on two successive
trials, the ball will not behave in 
exactly the same way. The errors of 
measurement for this study, how- 
ever, were appreciably smaller than 
in the typical psychological study. 
The reliabilities, computed from two 
trials for each of the variables over 
the 80 balls, were in the range from 
.90 to almost 1.00. 


List of Test Items 


1. Diameter of the ball. 

2. Weight of the ball. 

3. Height of rebound of the ball 
when dropped from 28 inches. 

4. Length of string attached to 
the ball. 

5. Number of rotations of the ball 
in rolling down the length of the in- 
clined plane, a distance of 4 feet. 

6. Distance from the eye the ball 
must be held to cover exactly the 
black circle on the wall (experimenter 
stands 5 feet from the circle which 
has a 1-foot diameter). 

7. Diameter of the shadow cast by 
the ball when the ball is placed mid- 
way between the wall and the flash- 
light (light source is 4 feet from the 
wall). 

8. Number of rotations of the ball 
to travel the distance covered by one 
rotation of the automobile tire (28 
inch diameter). 

9. Number of squares and parts of 
squares covered by the ball if placed 
in the center of the checkerboard (i- 
inch squares). 

10. With the ball at one end of the 
fulcrum and the sliding weight placed 
so that the fulcrum is in balance, the 
distance of the weight from the ball
is measured.

11. After the ball rolls down the 
inclined plane and collides squarely 
with the black, cardboard box, the 
distance the box is moved from the 
foot of the plane is measured. 
12. The ball rolls down the inclined
plane and strikes the paddle
wheel causing it to revolve and the 
number of rotations is counted. 

13. The ball drops 6 inches onto 
the springboard (a wire spring keeps 
the board horizontal) and the num- 
ber of inches the board is depressed 
is measured. 

14. The ball is placed in the net 
beneath one end of the springboard 
and the sliding weight is moved to a 
position where the board is horizon- 
tal; the distance of the weight from 
the end of the board is measured. 

15. The ball is dropped 36 inches 
onto a hard surface and the number 
of rebounds greater than } inch is 
counted. 

16. The ball is dropped 36 inches 
to strike a hard surface inclined at 
22.5 degrees with the horizontal; the 
distance the ball lands away from 
the surface is measured. 

17. A croquet mallet on an axle 
swings down from a 45-degree angle 
and strikes the ball, causing it to roll 
up the inclined plane; the distance 
up the plane is measured. 

18. The ball is dropped 36 inches 
onto a three-sided piece of rebound- 
ing equipment, and the ball rebounds 
three times and rolls across the carpet 
to a stop; the distance of the ball 
from the equipment is measured. 

19. The ball is allowed to drop 
through a piece of rebounding equipment
and the ball will rebound two
or more times and roll across the
carpet; the distance of the ball from
the equipment is measured.

20. The ball suspended on its
string from a hook attached to a
wooden wheel is raised by revolving
the wheel; the number of wheel
revolutions is counted until the ball
is raised to touch the wheel.

21. The ball is suspended on its
string from the top of a board with
knobs which are numbered consecu- 
tively and the string is threaded 
around the knobs; the numbered 
knob closest to the ball is recorded. 

22. The ball is hooked on its string 
behind a roller toy which is pulled 
forward causing the string to wind 
around the toy; the distance the ball 
travels before reaching the toy is 
measured. 

23. A "Hickory-Dickory" board
has a pulley arrangement so that 
when the ball and string is threaded 
around the pulley, as the ball is pulled 
down, a toy mouse moves up the 
board; the number at which the ball 
and mouse meet is recorded. 

24. The ball on its string is al- 
lowed to swing back and forth like a 
pendulum; the number of swings in a 
15-second interval is counted. 

25. The ball is allowed to roll down
the inclined plane and across the 
carpet until it comes to a stop; the 
distance from the plane is measured. 

26. The ball is dropped 24 inches 
onto a hard surface inclined at an 
angle of 22.5 degrees with the hori- 
zontal, causing the ball to rebound 
and roll across the carpet; the dis- 
tance of the ball from the rebounding 
surface is measured. 

27. The ball is dropped from a 
height of 18 inches into a salad bowl 
filled to the brim with water and 
the water splashed out is collected 
and measured.

28. The ball rolls down a small inclined
channel and travels across a
narrow board with numbered mold- 
ing strips; the number at which the 
ball comes to rest is recorded. 

29. The ball suspended on its 
string is allowed to swing down from 
an angle of 45 degrees to strike a 
solid rebounding surface; the number 
of rebounds is counted. 

30. The string attached to the ball
is wound around the ball's circumference;
the number of windings is
counted.

31. At a height of 68 inches a 
flashlight beam is directed to the 
floor beneath and the ball suspended 
on its string is hooked below the 
light source; the diameter of the ball's 
shadow cast on the floor is measured. 

32. The ball suspended on its 
string is allowed to swing down from 
a 45-degree angle to strike a long 
narrow board; the number of inches 
that the board swings is recorded. 
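
Several of the items listed above follow from the marker attributes by elementary geometry or physics. The sketch below works out four of them under idealized assumptions that are ours, not the authors': rolling without slipping, a simple pendulum of small amplitude, one "swing" counted as a full back-and-forth cycle, and string wound in a single layer around the ball's circumference. Function names are hypothetical. Items 5 and 8 then depend on S alone, Item 24 essentially on L, and Item 30 on both L and S.

```python
import math

G_IN_PER_S2 = 386.09  # standard gravity expressed in inches per second squared

def rotations_on_plane(diameter_in, plane_in=48.0):
    """Item 5: rotations while rolling 4 feet without slipping."""
    return plane_in / (math.pi * diameter_in)

def rotations_per_tire_turn(diameter_in, tire_in=28.0):
    """Item 8: ball rotations needed to cover one turn of a 28-inch tire."""
    return tire_in / diameter_in  # ratio of circumferences; pi cancels

def swings_in_interval(pivot_to_center_in, seconds=15.0):
    """Item 24: full back-and-forth cycles of an ideal pendulum in 15 seconds."""
    period = 2 * math.pi * math.sqrt(pivot_to_center_in / G_IN_PER_S2)
    return seconds / period

def windings(string_in, diameter_in):
    """Item 30: turns of string around the ball's circumference."""
    return string_in / (math.pi * diameter_in)

print(rotations_per_tire_turn(7.0))           # a 7-inch ball turns 4 times per tire turn
print(swings_in_interval(10.0) > swings_in_interval(60.0))  # shorter string, more swings
```

Real balls on real strings depart from these formulas, which is the source of the natural measurement error the study wished to include.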


FACTOR ANALYSIS 


The factor pattern to be expected 
for these variables can be predicted 
with fair precision if it is assumed 
that the factors which will be ob- 
tained are in fact S, W, E, and L. 
Then equations from geometry and 
physics relating each test with the 
factors can be solved. These results 
are shown in Table 2. Of course, the
transition from the real world of
empirical relations to an abstract
world of perfect spheres, perfect
vacua, frictionless surfaces, and errorless
measurements may not always
be so simple.


TABLE 2
PREDICTED LOADING PATTERN

No.  S  W  E  L

Note. An X indicates a factor loading expected to
be large in value; a zero indicates a trivial loading.


TABLE 3
SIX LARGEST LATENT ROOTS

Root  Size


One problem encountered in an 
actual factor analysis is determining 
the number of factors. There are 
differing opinions as to the best pro- 
cedure to follow. For this purpose, 
we used a theorem by Guttman 
(1954) and a formula by Kaiser. 
Guttman has demonstrated that with 
unities in the diagonals, the number 
of latent roots, k, of size 1.00 or 
larger, is a lower bound for the num- 
ber of factors. This is based on his 
proof that it is not possible to find 
proper communalities (values in the 
range from zero to 1.00 such that the 
matrix remains Gramian) which will 
reduce the rank of the matrix to less 
than k. Kaiser has shown that when 
a latent root is smaller than 1.00, the 
alpha-reliability of that principal axis 
factor becomes negative. Thus, by
combining these two dissimilar approaches,
Guttman's theorem, which is
algebraic, and Kaiser's formula, which
is in part statistical, k becomes both
an upper and a lower bound for the
number of factors. The six
largest latent roots from the correlation
matrix are shown in Table 3,
from which we concluded that there
are essentially four factors.³
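
The counting rule described above (latent roots of 1.00 or larger, with unities in the diagonal) can be sketched in a few lines. The code below is an illustration under our own assumptions, not the computation used in the study: it finds the latent roots with a plain cyclic Jacobi routine written for the purpose, and it is applied to a small artificial correlation matrix with two independent clusters of three variables rather than to the 32-variable matrix, which is not reproduced here. All function names are hypothetical.

```python
import math

def jacobi_eigenvalues(matrix, sweeps=50, tol=1e-18):
    """Latent roots of a symmetric matrix by cyclic Jacobi rotations, largest first."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Angle that zeroes a[p][q] after the two-sided rotation below.
                theta = 0.5 * math.atan2(2 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)

def count_roots_at_least_one(corr):
    """Number of latent roots of size 1.00 or larger (unities in the diagonal)."""
    return sum(1 for root in jacobi_eigenvalues(corr) if root >= 1.0)

# Artificial example: two uncorrelated clusters, correlations of .6 within each.
block = [[1.0, 0.6, 0.6], [0.6, 1.0, 0.6], [0.6, 0.6, 1.0]]
corr = [[block[i % 3][j % 3] if i // 3 == j // 3 else 0.0 for j in range(6)]
        for i in range(6)]

roots = jacobi_eigenvalues(corr)
n_factors = count_roots_at_least_one(corr)
print([round(r, 3) for r in roots], n_factors)
```

For this artificial matrix the roots are 2.2 (twice) and 0.4 (four times), so the rule returns two factors, matching the two clusters built into it.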

3 The rationale for the factor analytic
techniques used in this study is completely
outlined in Chapter 3 of the study by Dickman
(1960).

The next step was to use Thurstone's
iterative procedure (1947, pp.
294-295) to calculate communality
estimates for the four factors. After 
10 iterations the procedure was 
stopped. The estimates had con- 
verged to the point where they were 
stable to two and possibly three 
decimal places. The average change 
for all 32 estimates was less than 
.0005 from the ninth to the tenth
iteration. Then using these values as 
our communality estimates, we pro- 
ceeded to extract four factors by the 
centroid method (for uniformity with 
most common practice in the research 
to which our generalizations are in- 
tended to apply). 
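
For readers unfamiliar with the centroid method, the following sketch extracts only the first centroid factor. It is a simplification under stated assumptions: it uses the four marker correlations as read from Table 1, keeps unities in the diagonal instead of the iterated communality estimates described above, and omits the sign reflection needed before later factors are extracted. The function names are ours.

```python
import math

def first_centroid_loadings(r):
    """First centroid factor: each column sum divided by the square root of the grand sum."""
    n = len(r)
    col_sums = [sum(r[i][j] for i in range(n)) for j in range(n)]
    total = sum(col_sums)
    return [s / math.sqrt(total) for s in col_sums]

def residual(r, loadings):
    """Correlations left after removing the factor's contribution."""
    n = len(r)
    return [[r[i][j] - loadings[i] * loadings[j] for j in range(n)] for i in range(n)]

# Marker correlations as read from Table 1 (order: S, W, E, L), unities in the diagonal.
table1 = [
    [1.00, 0.70, 0.13, -0.07],
    [0.70, 1.00, -0.26, -0.05],
    [0.13, -0.26, 1.00, -0.01],
    [-0.07, -0.05, -0.01, 1.00],
]

loadings = first_centroid_loadings(table1)
resid = residual(table1, loadings)
print([round(a, 3) for a in loadings])
```

A defining property of the centroid solution is that every column of the residual matrix sums to zero, which is why the variables must be reflected before a second factor can be taken out.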

It was immediately apparent from an inspection of the unrotated centroid factors that they were not interpretable. Moreover, they do not match the structure determined a priori by considering the mathematical and kinematic relations of the variables to the postulated four factors. Some rotation procedure is required for the factor loadings to make sense.

Table 4 shows the results of rotating the four centroid factors by the use of the Varimax criterion (Kaiser, 1958). Varimax, an analytic procedure which aims at simple structure under the restriction of orthogonality, does such a good job when the factors are truly orthogonal that further adjustments by graphical means are not often required. Unfortunately, in nature, as was pointed out previously, orthogonal factors are as uncommon as a straight tree.
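For readers who want to see what an orthogonal analytic rotation does mechanically, the following is a minimal sketch of raw (unnormalized) varimax using the widely used SVD-based iteration. It assumes NumPy; the function name, the tolerance, and the small loading matrix are hypothetical and are not the study's data or program.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Rotate a loading matrix orthogonally toward the raw varimax criterion.

    Returns (rotated, T) with rotated = loadings @ T and T orthogonal.
    """
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    T = np.eye(k)
    d = 0.0
    for _ in range(max_iter):
        B = L @ T
        # Gradient-like matrix of the varimax criterion.
        G = L.T @ (B ** 3 - B @ np.diag(np.sum(B ** 2, axis=0)) / p)
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt                       # nearest orthogonal matrix
        d_new = s.sum()
        if d_new < d * (1.0 + tol):      # no further improvement
            break
        d = d_new
    return L @ T, T

# Hypothetical unrotated loadings for four variables on two factors.
L0 = np.array([[0.7, 0.5],
               [0.6, 0.6],
               [0.6, -0.5],
               [0.7, -0.6]])
rotated, T = varimax(L0)
print(np.round(rotated, 2))
```

Because T is constrained to be orthogonal, the communalities (row sums of squared loadings) are unchanged by the rotation; only the distribution of variance over the factors changes.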

When the simple structure is 


A table containing nonrotated centroid factors has been deposited with the American Documentation Institute. Order Document No. 6911 from ADI Auxiliary Publications Project, Photoduplication Service, Library of Congress; Washington 25, D. C., remitting in advance $1.25 for microfilm or $1.25 for photocopies. Make checks payable to: Chief, Photoduplication Service, Library of Congress.


TABLE 4 


Var. S W E L
1 m .30 46 -.9 
2 49 36 -.10 -0 
3 Dh 4 22M 901. 0 
e 03 — 1.00 
5 -.9 —.35 =- Al 08 
6 94 28 3 A0 
7 93 KT a6. -.08 
8 -.M -.M -. 07 
9 85 Et 19 = 
10 40 85 — 11 - 02 
11 AT 87 —.16 — 06 
12 5 75 0 — 05 
13 55 ‘33. -0 -0 
14 — 49 —.85 11 02 
15 .01 —.16 79 —.0² 
16 06 -.M 91. -.01 
7 -.60 —.47 E 0⁰ 
18 04... ID <08 
19 ,01 .12 87 .04 
20 —.12 —.02 02 .99 
1 11 02 .98 
2 -9 -0 03 1000 
B -9 —.01 -—.9 -10 
24 -9 - 08 =A 
25 37 35 8 210 
26 27 07 88 0-02 
27 77 "SS me OS ens 
28 “64 Aa „ —04 
20 13 08 83 12 
30 68. te tle ER 62 
31 73 15 12 _ —.46 
32 52 75 a3 20 


oblique, an orthogonal analytical 
search program such as Quartimax or 
Varimax will usually not fit hyper- 
planes to any of the clusters of points, 
but instead, like Buridan's ass, will 
be unable to choose between them. 
The primary factors for size and weight happen to correlate in the vicinity of .70 despite the selection of hollow and solid balls of dense and not so dense materials. Figure 1, which shows the reference vector structure plot for size and weight, illustrates how impossible it is to get a good orthogonal fit under these circumstances. Here it is clear that if the hyperplane for weight is placed where it must obviously go, then the hyperplane for size cannot attain simple structure if placed orthogonal to it, or vice versa.

[Figure 1: plot with axes labeled Weight and Size.]

FIG. 1. Reference vector structure; size and weight (intermittent lines are hyperplanes).

Only in the case of the string 
length factor would orthogonality be 
a reasonable approximation, for it 
will be recalled that length was in 
fact controlled to be nearly orthog- 
onal to each of the other three fac- 
tors. It can be seen from an inspec- 
tion of Table 4 that, in this case only, 
the orthogonal factor is immediately 
interpretable and meaningful. How- 
ever, the marker variables as well as 
the clusters of variables associated 
with size, weight, and elasticity are 
also quite substantial in their load- 
ings on other factors and it is only 
when one considers their relative 
positions that the danger of a wrong 
interpretation appears. 

Table 5 shows the results of ro- 
tating to an oblique simple struc- 


ture. This was accomplished first by 
applying an analytic criterion, Bi- 
normamin (Kaiser & Dickman, 1959), 
which aims at oblique simple struc- 
ture, and then by making final ad- 
justments from an inspection of the 
graphs of the factor pairs. Only 
one adjustment was required. In any 
case the objectivity of a simple struc- 
ture solution does not depend on the 
extent to which a machine replaces a
psychologist's judgment, but on an 
independent test of goodness of sim- 
ple structure. 

Simple structure as defined by Thurstone's five rules (1947, p. 335) has a large number of near zeros, say values in the interval from -.10 to +.10, in the rows and columns of the reference vector structure. These near zeros define the hyperplanes, and the intersections of the hyperplanes are the primary factors. In Table 6 are shown the number of zeros in each column for the unrotated centroid factors, for the orthogonal varimax factors, and for the oblique reference vector structure obtained by the combined binormamin-graphical method. Except for the L column which, as stated, happens to be orthogonal to S, W, and E, the oblique factors are clearly superior to the orthogonal ones by this criterion of attainment of simple structure.

* A near zero is counted only if the value divided by the square root of the test communality lies within the interval from -.10 to +.10.
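The hyperplane count can be made concrete with a short sketch that follows the counting convention stated in the footnote (each value is divided by the square root of the test communality before it is compared with the ±.10 band). NumPy is assumed; the function name and the three-variable example are hypothetical.

```python
import numpy as np

def hyperplane_counts(structure, communalities, band=0.10):
    """Count near zeros per column of a reference vector structure.

    A value counts as a near zero if value / sqrt(communality) lies
    within [-band, +band].
    """
    V = np.asarray(structure, dtype=float)
    h = np.sqrt(np.asarray(communalities, dtype=float))
    normalized = V / h[:, None]
    return np.sum(np.abs(normalized) <= band, axis=0)

# Hypothetical structure for three variables on two factors.
V = np.array([[0.05, 0.80],
              [0.70, 0.02],
              [0.30, 0.30]])
h2 = np.array([0.64, 0.49, 0.25])
print(hyperplane_counts(V, h2))
```

In this toy example each column contains exactly one near zero, so the counts are 1 and 1.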

By using Bargmann's test (1955) 
for the statistical significance of sim- 
ple structure, the probability that a 
hyperplane count is a chance oc- 


TABLE 5 
OBLIQUE GRAPHICAL ROTATION 


Reference Vector Structure 


Var. S W E L
1 02 ET .00 .02 
2 —.03 8 03 .01 
3 955 8 8 
4 08 0 .01 .99 
5 50 0G .00 .03 
6 64 —.06 | —.01 .01 
7 .60 —.02 .01 .02 
8 „ .01 .03 
9 55 02 .05 .03 
10 —.03 63 —.08 .01 
1 — .30 75 —.02 —.04 
12 .07 E .05 —.02 
13 .03 .58 —.04 .02 
14 93 3 0 001 
15 .01 —.09 .72 21:03 
16 .09 —.18 .80 —.02 
17 —.32 —.19 .59 —.05 
18 —.02 .00 .85 .06 
19 —.14 17 .84 .02 

20 —,03 5-101 .01 .98 

21 —.04 .00 .01 .97 
22 —,02 901 .02 .99 
23 —.08 .03 01  —1.00 

24 = 06. 805 .03 | —.94 
25 .06 .24 51  —.09 
26 .10 .03 79 | —.08 

27 35 25 | —.09 .03 
28 35 .05 52 —.02 
29 —.01 .08 .78 .10 

30 3 EPOR .02 .58 

31 49 —.08 .o — 42 

32 .05 252. 15 .31 

Reference Vector Correlations 
S W E L 

S 1:00 040 170 MENS 12 

. 00 3 906 
E —.39 234 1.00 —.07 

L 12 %% „ 1400 


Primary Factor Pattern 


S W E L 
1.04 —.06 .00 .02 
—.05 1.02 —.03 .01 

.12 —.29 .88 —.05 

.05 —.05 .01 1.00 
—.84 —.10 .00 .03 
1.08 —.10 —.01 .01 
1.01 —.03 .01 .02 
—.87 —.06 .01 .03 

.92 —.03 .05 .03 
—.05 1.02 —.03 .01 
—.50 1.22 —.02 —.04 

212 .83 .05 —.02 

.05 .94 —.04 .02 

.03 21.01 05 —.01 

.02 —.15 78 —.03 

15 —.29 87 —.02 
—.54 —.31 64 —.05 
—.03 .00 .93 .06 
—.24 .28 .91 .02 
—.05 —.02 .01 .99 
—.07 .00 .01 .98 
—.03 —.02 .02 1.00 
—.13 .05 .00 —1.01 
—.10 —.08 .03 —.95 

.10 .39 56 —.09 

117; .05 86 —.03 

.59 ES! —.10 .03 

.59 .08 57 —.02 
—.02 13 85 10 
—.60 —.08 .02 .59 

.82 —.13 .00 —.43 

.08 .84 16 .31 

Primary Factor Correlations

       S      W      E      L
S    1.00    .76    .21   -.11
W     .76   1.00   -.05   -.06
E     .21   -.05   1.00    .03
L    -.11   -.06    .03   1.00


TABLE 6
HYPERPLANE COUNTS

              S     W     E     L
Centroid      0     1     1     3
Varimax       9     8    10    21
Graphical    18    19    21    21

Note. Near zeros in the reference vector structure columns.


currence can be determined. If the 
probability is low, say less than .05 
or .01, then the null hypothesis can 
be rejected, and it is concluded that 
the structure is inherent in the data 
and that the rotation procedure has 
found this inherent structure. 

For this study, a hyperplane count of 11 or more indicates stability at level p < .05, and if the count is 12 or more, the factor column is significantly stable at level p < .01. If the Bargmann test were to be rigorously applied to the orthogonal factors, one would be unable to reject the null hypothesis that the zeros in the S, W, and E columns could occur by chance. On the other hand, the high hyperplane counts of 18 and more for the oblique solution indicate extremely significant results. The size of the critical region for a count of 18 is approximately 5 × 10⁻⁷.

Factor analysis is a procedure for 
expressing the relations of a battery 
of tests with a set of hypothetical 
factors. There are many uses for 
factor analysis, and there may be 
many situations where an investiga- 
tor may prefer to express these rela- 
tions upon a set of independent factor 
axes. In these situations, he should 
choose orthogonal axes, and Barg- 
mann's test may be inappropriate. 
But if the investigator intends to 
interpret the factors in terms of 

common variance with the test vari- 
ables, then he should aim at simple 
structure. We have presented evidence to show that these hypothetical constructs, the factors, relate more closely to natural phenomena when an oblique simple structure solution is obtained.


SUMMARY 


1. With the aim of throwing light on the nature of the concepts generated by factor analysis, through testing the method on a known physical model, some 32 properties or behaviors were measured for 80 balls varying in size, weight, elasticity, and the length of string on which some of their "performances" were measured. The variables were intercorrelated and factored.

2. Applying the usual procedures in well controlled psychological use of factor analysis, we found: (a) that four factors were indicated as the appropriate number, which checks with the number of influences a physicist would posit in this situation; (b) that a simple structure of high significance, by Bargmann's test, existed in the data and was attainable by oblique rotation but not by orthogonal rotation; (c) that the factors obtained, recognized by unit loadings on some measures of size, weight, elasticity, and string length, proved to be precisely the four influences which were predicted from a mathematical and kinematic analysis of the variables.

3. The attempt at orthogonal simple structure rotation failed on three counts: (a) it showed a tendency to fit hyperplanes to none of the clusters of points; (b) it did not yield a statistically significant structure (Bargmann test) in the best position it could attain; (c) except in the case of string length, which happened to be orthogonal, the interpretation of the factors cannot be so clearly made as in the oblique case.


The last point deserves some expansion. In Figure 2 is shown the factor pattern plot of size against elasticity, drawn on oblique axes with a cosine of .21 (see Table 5). Evidently, there is some tendency for larger balls to be more elastic, perhaps because they are more frequently pneumatic. From this diagram two summarizing points can be illustrated as follows.

[Figure 2: plot with axes labeled Size and Elasticity.]

FIG. 2. Primary factor pattern; size and elasticity (intermittent lines are orthogonal axes).

4. If one insists on erecting the axis for the second factor at a right angle to the first, its meaning becomes a composite, a highly accidental composite, of the two influences that have been clearly isolated by the oblique simple structure. Thus, in Figure 2 the variables close to the S axis (but not including the cluster about the origin), which are in meaning practically pure measures of size, would become, against the orthogonal factors shown in the intermittent lines, a composite structure. Similarly, those points close to E would also become composite if related to the orthogonal lines. Forcing orthogonality thus means creating conceptual hybrids which are neither one nor the other. The practice of completing psychological research by mechanically applying orthogonal analytic computer programs, when there is no prior knowledge that the factors are in the rare orthogonal relationship, is therefore likely to lead to a crop of misleading psychological conclusions. Considerable visual single plane rotational adjustment is necessary to complete such research, and inspection of factor plots with subsequent adjustments may be necessary after oblique programs such as Oblimax or Binormamin are applied.
5. Psychologists who presume orthogonality sometimes say that they do so because, although factors may be oblique in the sample due to sampling errors, the factors must essentially be orthogonal in the population. The latter is a false assumption. There is no reason why in our interacting universe they should not be correlated in the population, e.g., as age, weight, and stature factors are in the human population. The only place in which these influences are uncorrelated is as abstract conceptions in the scientist's head. But if he wishes to obtain the purest and most accurate concepts of them, he must respect their obliquity where they actually occur. For if he insists on finding statistical independence for the entities which he believes exist, just because he thinks of them with conceptual independence, he may never find them at all.
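The practical meaning of correlated primary factors can be illustrated with the standard identity that, in an oblique solution, the factor structure (correlations of variables with factors) equals the factor pattern multiplied by the matrix of factor correlations. The sketch below assumes NumPy and uses the primary factor correlations of Table 5; the function name and the "pure size" variable are hypothetical.

```python
import numpy as np

# Primary factor correlations from Table 5 (order: S, W, E, L).
phi = np.array([[1.00,  .76,  .21, -.11],
                [ .76, 1.00, -.05, -.06],
                [ .21, -.05, 1.00,  .03],
                [-.11, -.06,  .03, 1.00]])

def pattern_to_structure(pattern, phi):
    """Correlations of variables with oblique factors: structure = pattern @ phi."""
    return np.asarray(pattern, dtype=float) @ np.asarray(phi, dtype=float)

# Hypothetical variable that is a pure measure of size (pattern loading 1 on S only).
pure_size = np.array([[1.0, 0.0, 0.0, 0.0]])
print(pattern_to_structure(pure_size, phi))
```

Even a variable that loads on size alone correlates .76 with the weight factor and .21 with the elasticity factor, which is why axes forced to be orthogonal cannot each coincide with one of these influences.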
A half century ago, Mott (1910) published a statistical compilation of pairs of relatives admitted to London County Council mental hospitals and noted a preponderance of female pairs as against male pairs, and of same-sexed as against opposite-sexed pairs of relatives. Myerson (1925) made a similar study in Massachusetts' Taunton State Hospital and replicated Mott's findings. Penrose (1942) postulated two autosomal genes to account for these different concordance rates regarding mental illness in the two sexes. His theoretical orientation stemmed from Freud's observation that sexual inversion predisposed to mental illness, as illustrated in the case of Schreber (Freud, 1911).

Penrose’s postulated Genes A and 
B were assumed to augment and ac- 
centuate femaleness and maleness, 
respectively. If a father had Gene A 
and passed it on to his son, both 
would have the inherited tendency 
to sexual inversion which would pre- 
dispose each to mental illness. If 
Gene A, however, was passed on to a 
daughter, her femininity would be 
enhanced or exaggerated, and she 
would escape the tendency to sexual
inversion. Similarly, if mothers 
passed on Gene B to their daughters 
both would tend toward sexual in- 
version whereas their sons inheriting 
Gene B would be exceedingly mascu- 
line and would be less prone to de- 
velop mental illness. 

Slater (1944) believed that higher
familial concordance in same-sexed
as against opposite-sexed pairs had
to be accepted as a well-established
fact, but he objected to Penrose's 
theory to account for it. He pointed 
out that expressivity and penetrance 
were influenced by genetic, bio- 
chemical, and environmental factors, 
and that it was only necessary to 
suppose that some of these factors 
were more likely to be the same for 
persons of the same sex than for 
persons of the opposite sex. How- 
ever, he made no attempt to indicate 
what these other genetic, biochem- 
ical, or environmental factors might 
be, so that his theory lacked the rela- 
tive specificity of Penrose's theory. 
Slater (1953b) subsequently con- 
sidered the possibility of a sex-linked 
recessive gene in mental illness. He 
analyzed the frequencies of same- 
sexed and opposite-sexed pairs among 
the avuncular and sibling relation- 
ships in Mott's series of cases, and 
concluded that a genetic explanation
was not satisfying because of
inconsistencies in the frequencies 
when both kinds of familial rela- 
tionship were considered. Also, if 
mental illness were due to a sex- 
linked recessive gene, an excess of 
maternal uncle-nephew pairs should 
have been found as compared to all 
other avuncular pairs, but no such 
excess occurred. However, he still 
considered the possibility that a sex- 
linked recessive gene might be play- 
ing a causative role with respect to a 
tendency for mental deficiency and 
paranoid schizophrenia to occur in 
the same families among Mott's 


Rosenthal (1959), in his analysis 
of Slater's (1953a) schizophrenic twin
series, pointed out several differences 
between the sexes with respect to 
clinical features of the illness asso- 
ciated with concordance. Premorbid 
history, age of onset, deteriorative 
outcome, and subtype diagnosis dis- 
tinguished concordant from dis- 
cordant monozygotic (MZ) twins if 
they were male, but not if they were 
female. Attention was also called to 
the greater number of index cases 
and the higher concordance rate 
among female as compared to male 
MZ twins in Slater's series. 

In a subsequent paper, Rosenthal 
(1961) noted that the preponderance 
of female twins obtained in the three 
largest of the five major twin studies 
of schizophrenia resulted from sam- 
pling among resident hospital popu- 
lations. Females were more likely to 
become inhabitants of the chronic 
wards than males. It was also 
pointed out that a sample obtained 
in large part from a chronic popula- 
tion showed higher concordance with 
respect to severity of the illness than 
a sample obtained primarily through 
consecutive admissions, so that rela- 
tionships found between sex and 
concordance rates might be in- 
fluenced by the kind of sampling 
procedure used. 

Thus, Mott's findings of half a century ago remain without any satisfying explanation, and the issues surrounding them seem to be more complex than originally believed. In this paper, it is intended to bring together various findings in the literature which are relevant to the theoretical considerations stated above, and to indicate the conclusions toward which these findings, in the aggregate, seem to be pointing. I shall attempt to determine whether higher familial concordance for females than males, and for same-sexed than opposite-sexed members, obtains through all close blood relationships. I shall consider various possible explanations of these findings, trying to indicate the relative merits of genetic and environmentalist hypotheses intended to account for the findings. I shall focus on these matters with respect to schizophrenia primarily, and shall use them to illuminate some obscure aspects of the etiology of this mental disorder.


FINDINGS


The closest familial relationship, 
probably in the interpersonal as well 
as the genetic sense, is to be found 
among MZ twins. There are, of 
course, no opposite-sexed MZ twins, 
but concordance rates can be com- 
pared for MZ male and female pairs. 
Of the five major twin studies which 
aimed at statistically representative
samples, four (Essen-Möller,
1941; Luxenburger, 1928; Rosanoff, 
Handy, Plesset, & Brush, 1934; 
Slater, 1953a) gave breakdowns of 
their MZ twin series by sex, and one 
(Kallmann, 1946) did not. All con- 
cordant and discordant MZ pairs re- 
ported in the four studies have herein 
been summed according to sex, the 
combined figures shown in Table 1. 

It can be seen from Table 1 that 
concordance rates are indeed higher 
for female than for male MZ twins. 
All four studies are in agreement on 
this point, although none reaches 
statistical significance by itself because
of the small numbers of cases.
Indeed, the discordance rate is more 
than twice as high for the male as 
for the female MZ twins, represent- 
ing almost half of all male pairs, but 
less than one in four female pairs. 

This marked sex difference does not occur in Kallmann's series, although the very slight difference in concordance rates which he reports is also in the direction of higher concordance among female MZ pairs.

TABLE 1

NUMBER OF MALE AND FEMALE PAIRS OF MONOZYGOTIC TWINS CONCORDANT AND DISCORDANT WITH RESPECT TO SCHIZOPHRENIA
(Four studies)

                              SEX
              Male   Percent   Female   Percent   Total   Percent
Concordant     23    (54.8)      47     (78.3)      70    (68.6)
Discordant     19    (45.2)      13     (21.7)      32    (31.4)
Total          42                60                102

Note. χ² = 5.328; at p = .02, χ² = 5.412.

When we attempt to analyze the concordance rates of male and female dizygotic (DZ) twins, we find that only the series of Rosanoff et al. and Slater can be combined statistically. Kallmann does not give the actual number of same-sexed male and female DZ pairs. Luxenburger in a later report (1930) still had not found any concordant DZ pairs in his series of 37 DZ index cases. Essen-Möller did not concern himself with or report on his opposite-sexed pairs. The relevant data from the Rosanoff and Slater studies are summarized in Table 2.

It can be seen from Table 2 that there are only 28 concordant DZ pairs in both studies combined. When these are divided into three sex categories, the Ns for analysis are rather small. Nevertheless, it is clear in Slater's series that the concordance rate for female DZ pairs is higher than the rate for male DZ pairs, and that same-sexed pairs are more concordant than opposite-sexed pairs. In the Rosanoff et al. study, the concordance rate is higher for male DZ pairs than for female DZ pairs. There are only 11 male pairs in their series as compared to 42 female and 48 opposite-sexed pairs, suggesting an unusual degree of sampling bias with respect to the male pairs. It will be noticed that the female:male ratio in the Rosanoff et al. study is approximately 4:1, a most deviant distribution, while in the Slater study the ratio is approximately 2:1, which also deviates markedly from theoretical expectancy. Some reasons for and implications of such sampling biases have been discussed at length by Rosenthal (1961).

1 This is the only instance in the literature where, for any given blood relationship, I have found a higher concordance rate for male than female pairs with respect to schizophrenia.

TABLE 2
CONCORDANCE WITH RESPECT TO SCHIZOPHRENIA IN MALE AND FEMALE DIZYGOTIC TWINS

                       Slater (a)             Rosanoff et al. (b)          Total (c)
Sex              Concordant  Discordant   Concordant  Discordant   Concordant  Discordant
Male-male         2 ( 9.5%)      19        3 (27.3%)       8        5 (15.6%)      27
Female-female     9 (22.5%)      31        7 (16.7%)      35       16 (19.3%)      67
Male-female       2 ( 3.7%)      52        5 (10.4%)      43        7 ( 6.9%)      95

Note. From Slater (1951) and Rosanoff, Handy, Plesset, and Brush (1934-35).
a χ² = 8.18, p < .02.
b χ² = 2.21, p > .30.
c χ² = 6.50, p < .05.

TABLE 3

INCIDENCE OF CASE HISTORIES SUGGESTING "TRAUMATIC OR INFECTIOUS ETIOLOGY" IN ALL TWIN PAIRS DISCORDANT AS TO SCHIZOPHRENIA

                                       Sex of the schizophrenic twin
                                       Male           Female          Total
History of trauma or infection         16 (41.0%)     6 (10.0%)        22
History negative for trauma
  or infection                         23             54               77
Total                                  39             60               99

Note. Data from Rosanoff et al., 1934. χ² = 11.43, p < .001.

Nevertheless, the concordance rate 
is again higher for same-sexed than 
opposite-sexed pairs in the Rosanoff 
et al. study, although the difference 
does not reach statistical significance. 
Combining both studies (which at 
least provides larger Ns for each 
category, and which perhaps cancels 
out some of the sampling errors of 
each study) yields statistically signif- 
icant differences between the dif- 
ferent types of sex pairs. The con- 
cordance rate is somewhat higher for 
females than males, and the concord- 
ance rate for same-sexed pairs is 
18.3% as against 6.9% for opposite- 
sexed pairs, an appreciable dif- 
ference. 

Since twins comprise a special and 
unusual kind of familial group, we 
might pause in our investigation of 
familial concordance among the sexes 
to ask if there is any information 
available relevant to twins which
might account for the sex differences 
in concordance reviewed here. One 
finding worth noting was reported by 
Rosanoff et al. They examined the 
histories of all twin pairs, both MZ 
and DZ, in which only one twin was 
diagnosed schizophrenic, and ob- 
served that the proportion of affected 
cases having a “probably traumatic 
or infectious etiology" seemed to be 
higher in the males than in the fe- 
males. The relevant data are brought 
together and summarized in Table 3. 

It can be seen from Table 3 that 
clinical histories suggesting traumatic 
or infectious etiology occurred about 
four times as often among the male
as among the female schizophrenic 
twins. The sex difference in this re- 
spect is highly significant, statis- 
tically. Hereditarily speaking, such a 
finding is consistent with the hypoth- 
esis of genetic predisposition in the 
affected twin, and one could infer 
that manifestation of the illness 
might not have occurred without the 
trauma or infection, the latter serving
as environmental factors which
fostered and abetted manifestation.

The following considerations are
raised by such an hypothesis: 

1. Has it been shown in any study heretofore that physical trauma or infection are etiologic with respect to schizophrenia generally? Reviews of the literature relevant to this point indicate that, aside from a number of isolated case reports, there is no evidence to support this view (Bellak, 1947; Kety, 1959; Overholser & Werkman, 1958). We must therefore question why such factors would be relevant to twins but not to single born persons. Should such a finding be confirmed with nontwins, the hypothesis would be bolstered considerably.

2. Such a finding really explains higher discordance among male pairs rather than higher concordance among females. In other words, the number of concordant pairs should be about the same for both sexes, but the number of discordant pairs should be greater among male twins, if the hypothesis is correct. The data in Tables 1 and 2 suggest that it is rather the higher frequency of concordant pairs among females which is distinguishing the two sexes. There may however be some sampling bias involved here, as noted above, so that the hypothesis cannot be discredited with complete assurance on this point alone.

3. The authors do not mention whether the frequency of trauma and infection was similarly or dissimilarly distributed among the nonaffected twin partners. Presumably, they did not have this information since it would be so hard to come by. It would be reasonable to assume that physical trauma, if not infection, occurs more frequently among males than females generally. If this is true, one would suppose that trauma and infection occurred more frequently among the nonaffected male twins than among the nonaffected female twins. Moreover, the probably higher incidence of trauma among male twins would also imply a higher incidence of schizophrenia among males than among females generally. Although there is some evidence to show that such a difference does occur (Landis & Page, 1938; Malzberg, 1935), the difference is not large and probably not at all proportional to the difference between the sexes with respect to the frequency of trauma.

Thus, although the Rosanoff et al. hypothesis is a suggestive one, there is little to support it and some evidence against it. Another kind of information relevant to the main problem under discussion can, however, also be found in their study. The authors searched the same clinical histories of the affected twins among discordant pairs for evidence of "etiological psychic factors." The relevant data are summarized in Table 4.

TABLE 4
INCIDENCE OF "ETIOLOGICAL PSYCHIC FACTORS" IN THE CLINICAL HISTORIES OF MALE AND FEMALE DISCORDANT SCHIZOPHRENIC TWINS

                                     Male   Female
Psychic factors in evidence            6      21
Psychic factors not in evidence       33      39

Table 4 shows that etiological psychic factors were in evidence among 35% of the affected female twins but among only 15.4% of the affected males. The difference falls barely short of the 5% level of significance. The authors state that it was especially those psychic factors having to do with sex or love life which were more common in the female sex. If such "etiological" factors are more common among female twins, then they would be more likely to occur among both members of a pair of female as compared to male twins, and would thus make for a higher concordance rate for the female twins.

Against this hypothesis, it may be said that we do not know about the distribution of these factors in the nonaffected twin partners. One could well imagine that a similar distribution might have obtained among them, since problems around sex and the love life probably were at that time (and perhaps still are) more prevalent among women than men (Offergeld, 1957). Moreover, one would normally have some reservations about inferences regarding psychological problems when these inferences are made from hospital records. It may be simply that the women were less loath to talk about psychological problems, especially sexual ones, than were the men. Gross (1959) found that women admitted to having psychopathological traits more freely than men, while men showed more conscious and unconscious denial and constriction of behavior. Moreover, if such problems are crucial etiologically with respect to schizophrenia, and if they occur more commonly among women than men, then the incidence of this disorder should be higher among females. As noted above, frequency of hospitalization for schizophrenia in this country is, if anything, higher among males.

TABLE 5

CONCORDANCE AS TO SCHIZOPHRENIA IN SAME-SEXED AND OPPOSITE-SEXED DIZYGOTIC TWINS

                 Concordant   Discordant
                   pairs        pairs       Total
Same sex             34          262         296
Opposite sex         13          208         221
Total                47          470         517

Note. Data from Kallmann, 1946. χ² = 4.15, p < .05.

In favor of the hypothesis is the 
clinical observation, so often reported 
that it is needless to cite references 
here, of morbid sexual preoccupation 
among schizophrenics of both sexes. 
Moreover, it is well established that 
good premorbid sexual adjustment 
is a favorable prognostic sign (Phil- 
lips, 1953; Wittman, 1941). But even 
if psychological disturbances in the 
erotic sphere could be proved to be 
of etiological significance in schizophrenia,
it would still be necessary
to integrate theoretically the findings 
in Table 4 with data which suggest a 
probably higher incidence among 
males generally but a higher con- 
cordance among female twins. 

Kallmann (1946) has published 
data regarding the number of same- 
sexed and opposite-sexed DZ twin 
pairs concordant with respect to 
schizophrenia. These data are sum- 
marized in Table 5, in which it is 
again shown that the concordance 
rate is higher for the same-sexed 
pairs. 
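The χ² values quoted for the fourfold tables in this review can be checked with a few lines of arithmetic. The sketch below assumes that the reported values were computed with Yates' continuity correction (an inference from the arithmetic, not something stated in the text); the function name is hypothetical, and the counts are those of Tables 5 and 3.

```python
def chi_square_yates(a, b, c, d):
    """Chi-square with Yates' correction for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2.0, 0.0)
    return n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Table 5: same-sexed vs. opposite-sexed DZ pairs, concordant vs. discordant.
table5 = chi_square_yates(34, 262, 13, 208)
# Table 3: history of trauma or infection (yes/no) by sex of the schizophrenic twin.
table3 = chi_square_yates(16, 6, 23, 54)
print(round(table5, 2), round(table3, 2))
```

Under this assumption the function reproduces the tabled values of 4.15 and 11.43; both exceed 3.84, the 5% critical value for one degree of freedom.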

Kallmann also stated that the con- 
cordance rate was 17.7% for female 
DZ pairs and 17.4% for male DZ 
pairs, as shown in Table 6. It is dif- 
ficult to evaluate these figures since 
there were only 34 concordant same- 
sexed DZ pairs in his series, and since 
the percentages cited represent mor- 
bidity risk estimates rather than ac- 
tual concordant pairs, as in Table 2. 
Kallmann also published concord- 
ance rates (morbidity risk  esti- 
mates) for the siblings of twin index 
cases, as shown in Table 6. When 
the sibling was of the same sex as the 
twin, the concordance rate Was 
16.195. When he was of the opposite 
sex, the rate was 12.3%. Among 
those siblings who were of the same 
sex as the twin index cases, the con- 
cordance rate was 16.3% for females 
and 15.9% for males. Although 
again we cannot infer the number 
of actual pairs, and the numbers are 
obviously small, the slight differences 
reported are in the expected direc- 
tion. In Slater’s series, which had 
about two females for every male 
index twin, the age-corrected inci-
dence of schizophrenia among sibs
of the index cases was 3.4% for male
sibs and 7.3% for female sibs, a find-
ing which is again in the direction of
higher concordance among females.

TABLE 6

MORBIDITY RISKS AS TO SCHIZOPHRENIA IN SIBLINGS AND DIZYGOTIC COTWINS
OF TWIN INDEX CASES ACCORDING TO SEX

[Table body not recoverable from the source. Column heading: Sex of the
twin index case (Male, ...). Row labels: Same sex; Opposite sex; All cases.]

Note. Data from Kallmann, 1946.


Concordance in sibling pairs where
neither is a twin has been reported in
two extensive investigations by
Schulz (1932) and Zehnder (1941).
The relevant findings are summarized
in Table 7.

On the basis of sex distribution 
alone, we expect that the number of 
same-sexed pairs should be approxi- 
mately equal to the number of op- 
posite-sexed pairs. If we combine the 
figures from both studies (which are 
in general agreement on this point) 
we find 118 same-sexed pairs as
against 85 opposite-sexed pairs. The 
discrepancy differs significantly from 
the expected equal distribution. 
With respect to male versus female
pairs, we again find a higher number
of female than male pairs in Zehn-
der's sample. If we assume an ex-
pected equality of pairs by sex, then
the discrepancy from equality is
statistically significant (p < .05). In
the Schulz sample, however, equality
appears to be almost perfect. But in
this study, there were actually 367
male index cases as against only 293
female index cases in the original
sample, or 55.6% male versus 44.4%
female. Moreover, male index cases
had 1079.5 sibs as compared to 880
sibs of the female index cases, which
would have increased the probability
of having more brother pairs relative
to sister pairs. But among pairs of
siblings, the females still comprised
over 50%. Thus, a difference of un-
known magnitude in favor of female
pairs occurred in the Schulz study as
well.


TABLE 7

SEX OF SIBLINGS CONCORDANT WITH
RESPECT TO SCHIZOPHRENIA

Sex              Schulz    Zehnder    Total
                 (1932)    (1940)*
Male-male          37        13         50
Female-female      38        30         68
Male-female        56        29         85
All pairs         131        72        203

* [Footnote partly illegible in the source] ... whereas Zehnder simply
ascertained pairs of affected sibs, without specifying ... the studies
... 47 ... was less certain, the figures would have been ... 71 for the
brother, sister and brother-sister ...

It is relevant to point out at this
time how sampling procedures may
by themselves lead to such dif-
ferences. Zehnder, for example, col-
lected all pairs of siblings admitted to
a single Swiss mental hospital over a
20-year period. This sounds like a
total sample which should therefore
be free of biasing factors. However,
we need only to consider that males
are generally more migrant than
females, and that the differential was
probably at a peak during the early
years of this century when most of
these cases were admitted. If two
brothers were both schizophrenic,
but one had emigrated before his
illness, this pair would not have been 
included in Zehnder's sample. Sis- 
ters, however, who would probably 
both have been relatively more 
housebound, would have found their 
way to the same hospital if both had 
become ill. 

In this regard it is interesting to 
note that there were 50 male pairs 
as compared to 85 opposite-sexed 
pairs in the Schulz and Zehnder 
studies combined. These values do 
not differ significantly from the ex- 
pected values of 45 and 90 pairs for 
the respective groups. However, the 
68 female pairs are significantly 
greater than one-half the number of 
opposite-sexed pairs. Thus, in com- 
paring same-sexed with opposite- 
sexed pairs, it is the group of female 
pairs which makes for the higher fre- 
quency of concordant siblings among 
the same-sexed pairs. The possibility 
also exists that same-sexed siblings, 
male or female, are more likely to be 
close to one another and live in the 
same area than are opposite-sexed 
siblings. Such factors alone, if 
proven correct, could account for 
the data reported in Table 7. 
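
The comparisons in the preceding paragraphs can be checked with a short computation. What follows is only a minimal sketch, assuming a Pearson chi-square goodness-of-fit test with one degree of freedom; the text does not state which procedure was actually used, and the function name `chi_square_gof` is hypothetical. The counts are taken from Table 7 (Schulz and Zehnder combined).

```python
def chi_square_gof(observed, expected_ratio):
    """Pearson chi-square statistic for observed counts against an expected ratio.

    Returns the statistic and the list of expected counts.
    """
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, expected


# Conventional critical value of chi-square with 1 degree of freedom at p = .05.
CRITICAL_05 = 3.841

# Each entry: (observed counts, expected ratio under sex distribution alone).
comparisons = {
    "same-sexed vs opposite-sexed (1:1)": ([50 + 68, 85], [1, 1]),
    "male pairs vs opposite-sexed (1:2)": ([50, 85], [1, 2]),
    "female pairs vs opposite-sexed (1:2)": ([68, 85], [1, 2]),
}

for label, (obs, ratio) in comparisons.items():
    stat, expected = chi_square_gof(obs, ratio)
    verdict = "significant" if stat > CRITICAL_05 else "not significant"
    print(f"{label}: expected={expected}, chi2={stat:.2f} ({verdict} at .05)")
```

Under these assumptions the male-pair comparison reproduces the expected values of 45 and 90 cited above and falls well short of significance, while the female-pair comparison exceeds the .05 critical value, which agrees with the conclusions drawn in the text.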

With respect to research in this 
area, two points are worth making. 
The first follows directly from the 
immediately preceding discussion. In 
studies of concordance among sib- 
lings, each and every sibling in all 
families comprising the starting cases 
of the sample must be investigated. 
The same principle holds true for 
other blood relationships as well. It 
is not enough to report only known 
cases or readily available relatives. 
Of course, such a sampling ideal is in 
practice extremely difficult to 
achieve, for apparent reasons. 

This leads us to the second point, 
viz., that research on twins has an
advantage over familial studies in
that the sampling ideal just indicated
can be more readily approached,
once one has defined his sample of 
starting cases. This is true because: 
(a) Twins are the same age. One does 
not have the problems of cases cover- 
ing two or more generations where 
the antecedent generation has been 
excessively decimated by deaths, or 
the subsequent generation has not 
yet reached or lived through an ap- 
preciable part of the morbidity risk 
period. (b) Twins are more readily 
located. This is true because twins 
are less likely than other siblings or 
relatives to be separated (which can 
of course be a disadvantage from 
another point of view), and are more 
likely to keep in touch or know of 
each other's whereabouts if they are 
separated. (c) The number and type 
of relatives to be ascertained are 
more clearly specified than in the 
case of some other familial groupings. 
An individual may have many sibs 
or none at all, but a twin always has 
just one cotwin, and only that par- 
ticular person needs to be found 
and studied. Moreover, from a gene- 
tic standpoint, the same paternity in 
the case of twins is always assured, 
but no similar guarantee obtains in 
the case of other siblings. (d) The 
representativeness of the sample is 
more easily verified since the distribu- 
tion of twins in the general popula- 
tion according to sex and zygosity is 
known and can serve as a standard 
against which the sample can be 
compared. 


CONCORDANCE BY SEX FOR PRIMARY 
AND COLLATERAL FAMILIAL 
RELATIONSHIPS 


In pursuing our inquiry further we
may ask two questions of theoretical
relevance: (a) Does the pattern of
familial concordance by sex obtain
only with the primary family group,
i.e., parent, child, and sibling, or does
it also extend to aunts, uncles, and
cousins? If genetic factors are at
work, the extension to avuncular 
relationships should be found, but to 
a lesser extent. If not, hypotheses 
emphasizing intrafamilial influences 
of a psychological nature would be 
enhanced. (b) Does the pattern of 
familial concordance by sex obtain 
for mental illness generally, or is it 
specific to those disorders called 
schizophrenic? 

To deal with these questions, we 
have recourse to large scale studies 
by Mott (1910), Myerson (1925), 
and Penrose (1945). Mott began his 
studies shortly after the turn of the 
century when he established an “He- 
redity Index," a card-filing system in 
which names, diagnoses, dates of ad- 
mission, and discharge, etc. were re- 
corded for all persons admitted to 
one of the London County Council 
Mental Hospitals who were known 
to have or to have had any other rela- 
tive as a patient in one of these hos- 
pitals. The files were continued long 
after Mott's original article in 1910, 
and Myerson (1925) and Slater 
(1953b) have reported more recent 
compilations of the filed data, these 
larger figures being cited here in- 
stead of Mott’s original figures. 
Myerson (1925) used the records of 
the Taunton State Hospital in 
Massachusetts to obtain a list of all 
pairs of relatives who had been ad-
mitted to that institution. For both
the Mott and Myerson studies, the 
figures reported here include all such 
pairs, no attempt being made to 
differentiate individuals diagnos- 
tically. They are simply called men- 
tally ill to a degree warranting certi- 
fication. 

Penrose's unpublished study has 
the largest accumulation of pairs of 
relatives. His survey was based on 
records of patients admitted to all 
Ontario Hospitals. Because his Ns 
were sufficiently large, and because 


he separated out two groups of dis- 
orders which he called schizophrenic 
and affective, the figures for the two 
groups are listed separately. Under 
the schizophrenic disorders were in- 
cluded all patients who had been di- 
agnosed: schizophrenic—the simple, 
paranoid, catatonic, or hebephrenic 
subtypes; involutional paranoid or 
senile paranoid; dementia praecox; 
schizophrenic defective; arterio- 
sclerotic paranoid; and alcoholic par- 
anoid. Under the affective disorders 
were included all patients who had 
been diagnosed: manic; manic depres- 
sive; reactive depressive or psycho- 
neurotic depressive; involutional de- 
pressive or melancholic; involutional 
psychosis; manic alcoholic; manic 
psychosis with mental defect; senile 
manic or depressive; arteriosclerotic 
manic or depressive. Clearly, these 
groupings will not satisfy everyone, 
perhaps not even many. The inclu- 
sion of known “organic” cases is 
questionable and distinctions such 
as between “involutional paranoid” 
and "involutional psychosis” may 
not be easy to make. Nevertheless, 
the attempt to relate cases in which a 
cognitive disorder predominates and 
cases in which an affective or mood 
disorder predominates is reasonable 
enough and is preferable to a labeling 
anarchy which generates a large
number of discrete, idiosyncratic 
categories. The main relevant data 
of the three studies are brought to-
gether in Table 8.

2 Omitted here are all those cases diagnosed
primarily as: arteriosclerotic, senile, paretic,
other organic, epileptic, defective, and Hunt-
ington's chorea.


TABLE 8

THE SEXUAL DISTRIBUTION OF PAIRS OF RELATIVES CERTIFIED MENTALLY ILL
(Three studies)

Relationship       Study            Criteria                 Both   Both     Male-     Female-
                                                             male   female   femaleᵃ   maleᵇ

Parent and child   Mott (1910)ᶜ     Mentally ill              78     137      103        96
                   Myerson (1925)   Mentally ill              55      80       59        56
                   Penrose (1945)   Schizophrenic disorder    28      55       16        51
                   Penrose (1945)   Affective disorder        48      67       68        59
                   Total                                     209     339      246       262

Sibling            Mott (1910)ᶜ     Mentally ill             140     211         212
                   Myerson (1925)   Mentally ill              57      80         110
                   Penrose (1945)   Schizophrenic disorder   106     124         192
                   Penrose (1945)   Affective disorder        69     118         140
                   Total                                     372     533         654

Uncle, aunt and    Mott (1910)ᵈ     Mentally ill              67      73       53        30
nephew, niece      Myerson (1925)   Mentally ill              41      42       37        43
                   Penrose (1945)   Schizophrenic disorder    39      39       35        42
                   Penrose (1945)   Affective disorder       [figures not legible in source]
                   Total                                     [figures not legible in source]

ᵃ Male patient and female relative designates father-daughter and uncle-niece combinations.
ᵇ Female patient and male relative indicates mother-son and aunt-nephew combinations.
ᶜ Figures cited by Myerson (1925).
ᵈ Figures cited by Slater (1953).


With respect to the parent-child
relationship, it can be seen in Table 8
that there were consistently more
mothers (601) than fathers (455)
and more mother-daughter (339)
than father-son (209) pairs. This was
true for the broad classification
"mentally ill," but was most striking
in Penrose's group of schizophrenic
disorders. In this group there were
106 mothers versus 44 fathers, and
twice as many mother-daughter pairs
as father-son pairs. These data may
appear to have interesting psycho-
logical implications, but can readily
be explained simply on the basis that
schizophrenic females are more likely
to marry than schizophrenic males,
and that married schizophrenic fe-
males tend to have more children
than their male counterparts (Essen-
Möller, 1959). There was also a pre-
ponderance of mentally ill daughters
(585) as compared to sons (471).
This is less readily explained, al-
though the possibility that more sons
than daughters may have emigrated
from the areas covered by the re-
spective studies may have been an
important factor. The possibility
also exists that daughters of mentally
ill parents are more likely to be af-
fected than sons. It is interesting to
note that both the mentally ill
mothers and fathers had more af-
fected daughters than sons, the pro-
portions of daughters to sons being
56.6:43.4 in the case of the mothers
and 54.0:46.0 in the case of the
fathers. The slight difference is in
favor of mother-daughter as com-
pared to father-daughter pairs, which
is in keeping with the previously
noted findings of higher concordance
among female than other familial
pairs.
With respect to Penrose’s schizo- 
phrenic group, the differences in con- 
cordance by sex are somewhat 
clearer. The certified mothers had 
slightly more affected daughters (55) 
than sons (51), whereas the fathers 
had more affected sons (28) than 
daughters (16). As a measure of the 
sex-concordance association, we may 
use the tetrachoric correlation which 
in this case is .25 or slightly less than
two times its standard error. Al-
though it does not reach statistical
significance, this finding is again con- 
sistent with the sex-concordance hy- 
pothesis. 

The findings with respect to sib- 
lings are in agreement with the Schulz 
and Zehnder studies cited above. 
There were more sister than brother 
pairs (533:372) and more same- 
sexed (905) than opposite-sexed pairs 
(654). The differences are marked 
and highly significant, statistically. 
In Penrose's schizophrenic group, the 
difference was less pronounced, but 
in the same direction. The question 
of sampling procedures which could 
be influenced by factors like differ- 
ential emigration rates in the sexes 
must be raised again. 

Such a factor would not account, 
however, for the fact that the number 
of brother pairs (372) exceeded
one half the number of all
brother-sister pairs (654). Chi square 
for this discrepancy from expectancy 
equals 5.78 which is significant at 
the .02 level. Thus again, we have a 
fairly firm finding of higher con-
cordance in same-sexed as against
opposite-sexed pairs of relatives. It 
should be noted, however, that most 
of the discrepancy occurred in the 
Mott series and not at all in Pen- 
rose's group of affective disorders. 
Failure to replicate the main finding 
in the latter group may simply re- 
flect the fact that women generally 
are more susceptible to affective dis- 
orders than are men (Dayton, 1940; 
Landis & Page, 1938; Larsson & 
Sjögren, 1954; Malzberg, 1935). 

When we examine the avuncular
groups, we find that whereas among
primary familial groups (parent-
child and sibling) there were 872 fe-
male pairs as against 581 male pairs,
there were only 199 aunt-niece pairs
as against 173 uncle-nephew pairs.
This latter difference is not a statis-
tically significant departure from a 1:1 ratio.
Moreover, almost all the difference 
occurred in Penrose's group of af- 
fective disorders, a finding which is 
again probably reflecting the higher 
incidence of these disorders in fe- 
males than males. 

Comparing same-sexed with op- 
posite-sexed avuncular pairs, we find 
a significantly higher frequency of 
same-sexed pairs in the Mott study 
but not in any of the other three 
groups.³ I have no idea why Mott's
data deviate from the others in this 
respect. With respect to Penrose's 
group of schizophrenic disorders, we 
find exactly the same number of 
aunt-niece as uncle-nephew pairs 
(39:39), and virtually the same 
number of same-sexed (78) as op- 
posite-sexed (77) pairs. Thus, the 
higher concordance of female than 
male pairs and of same-sexed than 
opposite-sexed pairs seems to obtain 
in the primary family groups but 
most of the evidence suggests that 
it no longer occurs, or occurs only 
to a very slight degree, when the 
familial relationships are one step re-
moved.


One could only wish that the sam- 
pling in these studies had been of a 
higher order so that one could have 
greater confidence in drawing con- 
clusions on a point of such high 
theoretical interest. Fortunately, one 
such study exists and we may use it 
as a check on the major findings and
inferences just discussed. 

3 It is worth noting that Slater (1953b) used
the avuncular series in Mott's data to test the
hypothesis of sex-linked recessives in mental
illness and failed to find anything more than
an infirm suggestion of a linkage between
mental deficiency and paranoid schizo-
phrenia. The Myerson (1925) and Penrose
(1945) data are not encouraging with respect
to finding any sex-linked recessives.

Penrose (1942) collected what was
virtually a consecutive series of 500
male and 500 female patients who
had been certified at the Ontario
Hospital, London. The known rela- 
tives of each case were investigated 
and classified according to whether 
or not they had suffered from mental 
illness. Thus, this was more than an 
attempt to obtain pairs from hospital 
records only. Unfortunately, the 
classification approached complete- 
ness only in the case of parents, chil- 
dren, and sibs of the 1,000 starting 
cases, complete ascertainment of 
grandparents, uncles, aunts, nephews, 
nieces, and cousins being more diffi- 
cult to achieve. Thus, although a 
substantial improvement, sampling 
still fell considerably short of ideal. 
Relatives were divided into three 
groups: (a) Those who had been certi- 
fied (excluding cases of mental de- 
fect or epilepsy without psychosis), 
(b) those who showed signs of psy- 
chosis but who had not been certified, 
(c) a conglomerate group designated 
“psychopathic” which we shall not 
include in our discussion.⁴

Rather than present the figures 
separately for each kind of familial 
relationship, I have arranged the 
pairs by sex according to whether the 
relationships were primary or sec- 
ondary. Certified and uncertified 
psychotic relatives were combined. 
In this way, we avoid too small Ns 
and yet address ourselves to our main 
hypotheses. The relevant data have 
been compiled as in Table 9. 

4 In this group were relatives described as
nervous, excitable, hysterical, highly strung,
neurasthenic, neurotic, hypochondriacal, de-
pressive temperament, quarrelsome, bad
tempered, unbalanced, peculiar, queer, eccen-
tric, manneristic, bigamous, deserted family,
in jail, alcoholic, or drug addict. Such a
“group” clearly is not relevant to our inquiry,
even if we disregard questions about the relia-
bility of such descriptions.


TABLE 9

CONCORDANCE BY SEX WITH RESPECT TO
PSYCHOSIS IN PRIMARY OR SECONDARY
FAMILIAL RELATIONSHIPS

             Primary              Secondary
             relationshipsᵃ       relationshipsᵇ
             Patient              Patient
Relative     Male    Female       Male    Female
Female        59       93          57       66
Male          84       52          55       62
             r_t = .35            r_t = .01

Note. Data from Penrose, 1942.
ᵃ Includes parents, children, and siblings.
ᵇ Includes grandparents, uncles, aunts, nephews,
nieces, and cousins.


It can be seen in Table 9 that there
were slightly more female than male
pairs in both the primary and sec-
ondary groups. The difference be-
tween the groups in this respect
found in Table 8 no longer obtained.
The findings of Table 8 with respect 
to same- and opposite-sexed pairs 
however receive strong support in 
Table 9. Among primary familial 
groupings, there were 177 same-sexed 
as against 111 opposite-sexed pairs. 
The tetrachoric correlation equals 
.35, which is 3.77 times its standard 
error, p<.01. Among secondary 
familial groupings there were 121 
same-sexed as against 119 opposite- 
sexed pairs. The tetrachoric correla- 
tion equals .01, which is only about 
one-tenth its standard error, and of 
course not significantly different from 
zero. 
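
The two coefficients just cited can be approximated from the cell counts in Table 9. The sketch below assumes the cosine-pi approximation to the tetrachoric correlation, which is a swapped-in technique: the text does not say how the coefficients were obtained, and the function name `tetrachoric_cos_pi` is hypothetical. Under that assumption the values come out close to those reported.

```python
import math


def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric correlation.

    The 2x2 table is [[a, b], [c, d]], where a and d are the same-sexed
    (concordant) cells and b and c are the opposite-sexed cells.
    """
    if b * c == 0:
        raise ValueError("approximation needs nonzero opposite-sexed cells")
    odds_ratio = (a * d) / (b * c)
    return math.cos(math.pi / (1.0 + math.sqrt(odds_ratio)))


# Cell counts from Table 9 (Penrose, 1942):
# a = male patient, male relative; b = male patient, female relative;
# c = female patient, male relative; d = female patient, female relative.
primary = tetrachoric_cos_pi(84, 59, 52, 93)
secondary = tetrachoric_cos_pi(55, 57, 62, 66)

print(f"primary relationships:   r_t is about {primary:.2f}")
print(f"secondary relationships: r_t is about {secondary:.2f}")
```

The approximation depends only on the cross-product ratio of same-sexed to opposite-sexed cells, so a table with nearly equal products, as in the secondary groupings, yields a coefficient near zero.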

From a genetic point of view, mem- 
bers of pairs in the primary group 
share one-half of a common heredity, 
whereas those in the secondary group- 
ings share one-fourth (uncles, aunts, 
nephews, nieces, and grandparents)
or one-eighth (cousins) of a common 
heredity. These values indicate the 
theoretical expectancies regarding the 
relative proportions of explained vari- 
ance in the association between sex 
and concordance among different 
familial groupings, if genetic factors 
are solely accountable. Since there
appears to be virtually no association
at all between sex and concordance in 
the secondary group, one is left to 
conclude that factors other than 
genetic ones are involved in the 
association found in the primary 
familial groupings. It is assumed that 
a more complete ascertainment of 
relatives in the secondary group 
would not change the proportions 
shown in Table 9. This assumption 
seems tenable, since factors such as 
differential sex emigration should 
apply in all degrees of familial rela- 
tionship and should be manifest in 
even a partial sample that is rep- 
resentative, but it surely warrants 
further investigation. 

If other than genetic factors are im- 
plicated in sex differences in concor- 
dance rates, one might expect these 
differences to be accentuated in 
mental disorders defined primarily as 
the sharing by two people of char- 
acteristics of the illness in which 
heredity is probably playing less of a 
role. I refer to what has been known 
historically as "disorders of associa- 
tion.” Diagnostically, patients with 
such disturbances would often be 
subsumed under paranoid schizo- 
phrenia, but commonly too under 
hypochondriasis, these disorders not 
always being easy to differentiate. 
Since 1873, the term “folie à deux" 
has been applied most often to such 
cases (Lasègue & Falret, 1873). A
scholarly review of this subject has 
been written in an unpublished 
doctoral dissertation by Greenberg 
(1961) submitted to the University 
of Sydney. 

A predominance of female over 
male pairs with folie à deux has been 
found by a number of investigators 
who compiled series of cases, mostly
from reports in the literature (Gral-
nick, 1942; Kröner, 1891; Marandon,
1894; Wollenberg, 1889). Since we 
do not know the factors making for
some cases being reported and others
not, we are unable to estimate the 
role of possible sampling bias in these 
reports. Recently, however, Green- 
berg's (1961) study has become 
available and it is the first systemati- 
cally obtained sample of such cases, 
as far as I know. He states:


From the Board of Control's central registry 
of admissions to mental hospitals in England 
and Wales, instances of the more or less si- 
multaneous admission of two or more individ- 
uals, related either by blood or marriage, 
were noted for a period of five years. These, 
with the addition of several cases from the 
out-patient clinics of Guy's and St. Barthol- 
omew's Hospitals, made up a total of 114 
cases, involving 234 individuals, each of which 
was then investigated individually. A number 
of cases irrelevant to this investigation were 
then excluded: those involving aged sibs liv-
ing together, whose increasing dementia had 
precipitated their admission to hospital; 
related individuals developing psychoses 
apparently quite independently of one an- 
other; and several where the available clinical 
records were such as to preclude adequate 
appraisal. ... There remained 60 instances of 
folie à deux, à trois and à quatre, involving 124 
individuals, and these were submitted to de- 
tailed investigation. . . The positive criteria 
of selection were that the subjects should show 
very similar or identical clinical pictures; that 
they wholly or partially shared the same delu- 
sions; that there was presumptive evidence 
from the history that the development of these 
states was in some way interrelated. 


The criteria for inclusion of cases 
and the sampling procedures are 
relatively specific and straightfor- 
ward. The sample turned up 13 
mother-daughter pairs as against 2 
father-son pairs, and 18 pairs of 
sisters as against 3 pairs of brothers. 
There were 6 mother-son pairs, no 
father-daughter pairs, and 3 brother- 
sister pairs. Thus, although there was 
a preponderance of same-sexed as 
against opposite-sexed pairs, the dif- 
ference was accounted for primarily 
by the relatively high frequency of
female pairs. Although one could
raise the question of differential
fertility rates as a possible explana-
tion of the higher number of mothers
than fathers, nevertheless the ratio 
of mother-daughter to father-son 
pairs was almost identical with the 
ratio of sister to brother pairs, sug- 
gesting that the sex difference here 
may have been relatively independent 
of fertility factors. This suggestion 
is supported by the fact that among 
all parent-child combinations, same- 
sexed pairs occurred 2.5 times as 
often as opposite-sexed pairs. The 
high frequency of sister pairs, plus the 
concurrent finding of equal numbers 
of brother-sister and brother pairs 
(the numbers of course are small), 
pose a constellation of frequencies 
which strain any attempt to provide a
simple genetic explanation. 
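
The ratios mentioned in this paragraph follow directly from the pair counts quoted from Greenberg's sample. A brief sketch of the arithmetic is given here; the dictionary keys and variable names are illustrative only and do not come from the original study.

```python
# Pair counts as reported in the text for Greenberg's (1961) sample.
pairs = {
    "mother-daughter": 13,
    "father-son": 2,
    "sister-sister": 18,
    "brother-brother": 3,
    "mother-son": 6,
    "father-daughter": 0,
    "brother-sister": 3,
}

# Ratio of female to male same-sexed pairs, among parent-child and sibling pairs.
parent_child_ratio = pairs["mother-daughter"] / pairs["father-son"]
sibling_ratio = pairs["sister-sister"] / pairs["brother-brother"]

# Same-sexed versus opposite-sexed pairs among parent-child combinations.
same_sexed_parent_child = pairs["mother-daughter"] + pairs["father-son"]
opposite_sexed_parent_child = pairs["mother-son"] + pairs["father-daughter"]
same_to_opposite = same_sexed_parent_child / opposite_sexed_parent_child

print(f"mother-daughter : father-son = {parent_child_ratio:.1f} : 1")
print(f"sister : brother pairs       = {sibling_ratio:.1f} : 1")
print(f"same- : opposite-sexed parent-child pairs = {same_to_opposite:.1f} : 1")
```

The two female-to-male ratios are of the same order (about 6 to 1), and the parent-child same-sexed to opposite-sexed ratio is the 2.5 cited above; with counts this small the figures are descriptive only.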

DISCUSSION


Although relevant to a specific 
theoretical issue, the material covered 
in this analysis is quite heterogene- 
ous, especially with respect to 
methods of sampling and diagnosis. 
In the main, the data point up con- 
cordance rates with respect to schizo- 
phrenia which are higher for female 
than male pairs and for same-sexed 
than opposite-sexed pairs of relatives 
in primary family groups but not in 
familial relationships further re- 
moved. 

There are three possible ways of 
trying to account for such findings, 
and each warrants separate discus- 
sion: (a) the findings may be arti- 
facts produced by vagaries of sam- 
pling, (b) the findings may be valid and 
explainable on a genetic basis, (c) 
the findings may be valid and ex- 
plainable on a psychological basis. 

1. In presenting the material, I 
have called attention to ways in 
which sampling procedures alone
could have led to the findings in some
studies. Factors such as differential
migration of the sexes, incomplete-
ness of ascertainment, severity of
illness or age of the subjects in the
samples might have been contribu- 
tory in various ways. Can such fac- 
tors possibly account for all the find- 
ings? 

Differential migration rates must 
be considered as a serious and rele-
vant source of sampling bias when
pairs of relatives are ascertained only
from records of hospitals in a fairly
circumscribed area. The findings in
the studies of Zehnder (1941), Schulz
(1932), Mott (1910), Myerson (1925), 
Penrose (1945), and possibly Green- 
berg (1961) could conceivably have 
arisen from such a bias. However, in 
the studies of twins such a bias could 
not have occurred since the sampling 
involved single index cases rather 
than pairs of relatives, and since 
each ascertainable cotwin of every 
index case was included in the evalua- 
tion, even if the cotwin had emi- 
grated or died after reaching the age 
of morbidity risk. Since the con- 
cordance rates were found to be 
higher for female than male MZ pairs 
in the five major studies, and for 
female than male DZ pairs in Slater's 
study, and for same-sexed than 
opposite-sexed pairs of DZ twins in 
the studies of Rosanoff et al., Slater, 
and Kallmann, we have evidence 
that factors other than differential 
migration rates must have been in- 
volved. Also, higher migration rates 
for males than females would not by 
themselves have accounted for a
higher incidence of brother pairs 
relative to brother-sister pairs in the 
Mott, Myerson, and Penrose (1945) 
studies combined. One would have to 
add the qualification that brothers of 
females might have been more likely 
to migrate than brothers of males. 
Of course the brother pairs must have 
had sisters in their families and 
opposite-sexed pairs must have had
other brothers, so that the point
would have to be qualified further
with respect to the incidence of the
sex of other sibs, straining the 
hypothesis even more. 

Moreover, we would expect that as
the geographic area encompassed in
the sampling increased, migration
effects would have been propor-
tionately reduced. Thus, sex-con-
cordance ratios should have been
less in a study like Greenberg's which
covered all of England and Wales
than in a study like Zehnder's or
Myerson's which sampled from one
hospital, or than in a study like
Mott's or Penrose's, which sampled
from one county or province. Ac-
tually, the sex-concordance ratios
were highest in Greenberg's study, 
and there was great similarity be- 
tween the ratios in Myerson's study 
and those in the Mott and Penrose 
studies. 

Lastly, in a more fully ascertained 
sample, Penrose (1942) found higher 
concordance for same-sexed as
against opposite-sexed pairs of sibs. 
Thus, although they may well have 
been contributory, differential migra- 
tion rates alone could not have ac- 
counted for the array of sex-con- 
cordance findings presented here. 

Incompleteness of sampling does 
not per se constitute a serious objec- 
tion to the kinds of data presented 
unless one can point toward specific 
biases which are likely to occur on a
selective basis when some propor-
tion of prospective cases is missed.
One such bias, probably the most
important, would be differential mi-
gration rates in the sexes, but we have
just seen that this bias alone could
not account for many of the findings
presented. A second possible bias,
selective reporting of cases by sex, is
somewhat obviated in studies which
ascertained cases from hospital
records. We might, however, con-
sider the possibility in these studies
that relatives with the same family
name would more likely have been
found than those with different fam- 
ily names. If so, more brother pairs 
should have been found than sister 
pairs since many sisters had probably 
married, attenuating the seriousness 
of such a possibility. However, if it is
true that concordance rates are really 
higher for female than male pairs 
among collateral relatives as well, 
then a disproportionate loss of aunt- 
niece pairs through changes of family 
name might have occurred in studies 
which ascertained cases from hospital 
records (Mott, 1910; Myerson, 1925; 
Penrose, 1945). Such loss could 
conceivably account for the fact that 
no sex difference was found among 
avuncular pairs in these studies. 
However, in the Penrose (1942) study,
which ascertained cases by per-
sonal inquiry among families, such
loss is much less likely and we still
find that the higher concordance
rate among same-sexed pairs in the
primary family groups no longer ob-
tains in the collateral groups. The
latter finding would have to assume a
disproportionate loss among same-
sexed pairs as against opposite-sexed
avuncular pairs if the difference
between primary and secondary fam- 
ily groups is to be attributed to sam- 
pling bias, and this assumption seems 
improbable. 

It has been shown with respect to 
MZ twins that when the severity of 
the illness in the index case is great, 
concordance is likely to be consider- 
ably higher than if the index case is 
only mildly ill (Rosenthal, 1961). 
Thus, one may wonder if a similar 
relationship obtains among other 
relatives and if such a factor could 
explain the sex-concordances re- 
ported. However, there seems to be 
no reason to think that, for example 
in the twin studies, the males would 
have been less severely ill than the 
females. It has been pointed out in a 
few studies that males are actually 
more likely to get out of the hospital 
than females⁵ (Lengyel, 1941; Rosen- 
thal, 1961; Sternberg, 1948), so that 
even in the twin samples which were 
overloaded with chronic, severe cases, 
most of whom were women, the males 
ascertained as index cases would 
probably have been at least as 
severely ill as the female index cases. 
Thus, on this ground alone there 
would be no reason to expect higher 
concordance in the female than the 
male twin pairs, and it seems reason- 
able that this argument would extend 
to other pairs of relatives as well. 

One might conjecture that female 
patients are more likely than males to 
be communicative about themselves 
and their relatives, making for a more 
complete ascertainment of female 
than male cases. Such a conjecture is 
plausible enough, but it would not 
explain those findings where there 
was a higher frequency of brother 
pairs than brother-sister pairs. All 
told, it seems unlikely that sampling 
biases alone could have accounted 
for the main findings in this paper. 

2. Both Penrose (1942) and Slater 
(1944, 1953b) concerned themselves 
primarily with a possible genetic 
explanation of higher concordance for 
psychiatric disorder among same- 
sexed than opposite-sexed pairs of 
relatives. According to the data 
presented here, especially those re- 
lating to schizophrenics, such an ex- 
planation needs to be expanded to 
account also for the fact that this 
relationship no longer obtains among 
collateral pairs of relatives, that con- 
cordance rates are higher for female 
than male pairs, and that the fre- 
quency of admissions for schizophrenia 
is as high for males as for females, and 
maybe higher. 

5 This statement is based on data obtained 
during the first half of this century. Since 
the advent of tranquilizing drugs, differences 
between the sexes in this regard may no 
longer occur, but all the studies reviewed here 
were done before these drugs were in wide- 
spread use. 

Penrose's theory of two autosomal 
sex-augmenting genes could be modi- 
fied to account for higher concord- 
ance in female than male pairs. It 
would only be necessary to postulate 
that the effect of the male-augment- 
ing Gene B is greater (has higher ex- 
pressivity) than that of the female- 
augmenting Gene A. Thus, sisters 
affected with Gene B would both be 
more likely to manifest aspects of 
sexual inversion than both brothers 
affected with Gene A, and the sisters 
would therefore be more highly pre- 
disposed to schizophrenia. However, 
even with this modification, one 
should still find these factors operat- 
ing in collateral as well as primary 
family groups and one should also 
expect to find a greater number of 
females admitted for schizophrenia 
than males. Since the best evidence 
available goes contrary to these ex- 
pectations, the expanded theory still 
would not be adequate. 

The same point would apply to 
Slater's theory (1944) which was 
drawn along lines similar to Pen- 
rose's, but which objected to the 
specific "genetic" characteristics 
assigned a key etiological role by 
Penrose. Slater preferred to talk 
more broadly in terms of genetic, 
biochemical, and environmental fac- 
tors which influenced expressivity 
and penetrance of the inherited trait 
(schizophrenia), supposing only that 
some of these were more similar for 
same than opposite-sexed pairs. 
Some of these factors could also have 
been said to have higher expressivity 
in females than males, thus account- 
ing for the higher concordance found 
in female pairs. However, further 
hypothetical qualifications would be 
necessary to try to account for similar 
or lower admission rates for females 
than males and the virtual disappear- 
ance of the sex-concordance ratios 
among collateral relatives. These 
qualifications, even if conceptually 
possible, would strain the hypothesis 
considerably, making it unusually 
complex and unwieldy. Moreover, 
the theory would eventually have to 
specify what these factors might be 
and how they might facilitate or 
inhibit manifestation. Until it did, 
the theory could not be evaluated 
further. 

3. Psychological theories to ac- 
count for the above findings would 
not at all be hard to come by, and 
the task really becomes one of ex- 
cluding from consideration loosely 
conceptualized formulations which 
seem to be able to account for findings 
in one direction as well as another. 
Even if we limit ourselves to sound 
experimental and statistical studies in 
the psychological literature, we find 
such an abundance of heterogeneous 
studies of sex differences in personal- 
ity and behavior that a thorough 
analysis of them could not be at- 
tempted here. The findings are not 
always consistent, and methods, 
measures, source and age of subjects, 
and research goals differ widely 
among studies. Therefore, I shall 
selectively present a few studies 
which appear relevant to our in- 
quiry, briefly examining their ex- 
planatory power with respect to our 
major findings. The studies chosen 
bear on two lines of thought, not 
necessarily incompatible: one is the 
"anxiety-generalization hypothesis" 
of schizophrenia and the other has 
been called "sex-role identification." 
I do not present these as "theories" 
nor do I wish to convey the impres- 
sion that I espouse either one. I 
present them to show that findings 
already exist in the psychological lit- 
erature which are at least consistent 
with the sex-concordance rates pre- 
sented above, to illustrate in two 
ways how findings obtained in study- 
ing the genetics of schizophrenia may 
be related to some studies of per- 
sonality traits, to point up some 
problems which such lines of thought 
are confronted with when they are 
considered as possible factors ac- 
counting for the sex-concordance 
rates reported, and to imply that 
such lines of thought are deserving of 
a fuller critical and theoretical exposi- 
tion than can be attempted here. 

If we assume that high levels of 
anxiety predispose to schizophrenia 
(Mednick, 1958), and that at pre- 
psychotic stages high anxiety levels 
are often manifested as “neurotic” 
traits, then, based on the above data, 
we should expect the correlation of 
such traits to be higher for female 
than male pairs and higher for same- 
sexed than opposite-sexed pairs of 
relatives. Olson (1929) studied the 
“nervous habits” of 201 pairs of 
siblings in elementary school. The 
correlations were .32, .16, and .09 
for sister, brother, and sister-brother 
pairs, respectively. Crook (1937) 
administered the Bernreuter Per- 
sonality Inventory to 503 pairs of 
college students and found correla- 
tions for "neuroticism" of 435,522; 
and .13 for sister, brother, and sister- 
brother pairs, respectively. Carter 
(1933) administered the same inven- 
tory to 117 pairs of twins. Correla- 
tions for neuroticism were .61, .32, 
and .18 for MZ, same-sexed DZ, and 
opposite-sexed DZ pairs, respectively. 
The higher correlation for same-sexed 
than opposite-sexed DZ twins is 
consistent with the findings presented 
earlier, but the higher correlation for 
MZ than same-sexed DZ twins sug- 
gests that inherited factors may be 
importantly involved in neuroti- 
cism. Eysenck and Prell (1951) main- 
tained this point of view, but a com- 
parison of their study with Carter's 
indicates that each was measuring 
quite different traits labeled neurot- 
icism by the respective authors. 
These correlations are consistent 
with the main findings of this article. 
Two studies of parent-child pairs 
regarding neuroticism on the Bern- 
reuter are not quite as consistent 
(Crook, 1937; Hoffeditz, 1934). In 
both studies, mother-daughter pairs 
had higher correlations (.27 and .57) 
than father-son pairs (.06 and .05), 
but opposite-sexed parent-child pairs 
tended to have higher correlations 
(.23, .01, .23, .30) than father-son 
pairs. There is a suggestion that the 
ordering of most of these correlations 
could be accounted for in part by the 
assumption of a higher incidence of 
neurotic traits in females than males 
generally. There are a number of 
studies which support such an as- 
sumption (Castaneda, McCandless, 
& Palermo, 1956; Hattwick, 1937; 
Jersild & Holmes, 1935; Mathews, 
1923; Rosenblum & Callahan, 1958). 
However, if such factors have rele- 
vance for the found concordance rates 
by sex with respect to schizophrenia, 
they should also predict a higher in- 
cidence of schizophrenia in females 
than males, which is not the case, at 
least for hospital admissions. It may 
be possible that more females develop 
milder forms of the illness and are 
therefore less likely to be hospital- 
ized, accounting for the discrepancy 
from the prediction. Supporting this 
possibility is the well-established fact 
that females are generally admitted 
to hospitals later in life than males 
(Landis & Page, 1938; Larsson & 
Sjögren, 1954; Malzberg, 1935) and 
the finding of a much higher fre- 
quency of males than females with 
the most severely disorganizing sub- 
type of the disorder, hebephrenia 
(Schulz, 1932). 

If, however, anxiety predisposes to 
schizophrenia, and females are more 
anxious, and more males are hospital- 
ized for the disorder, at least earlier 
in life, then we seem to be confronted 
with an apparent contradiction. It 
may be that (speaking in Mednick's 
terms) generalization of anxiety oc- 
curs more readily in males than fe- 
males, fostering illness-inducing anx- 
iety-generalization spirals more fre- 
quently in males. Some evidence for 
such a possibility may be noted in a 
study by Sontag (1947) who found 
that girls were physiologically more 
reactive to stress than boys, but re- 
covered more quickly. 

At least one additional line of 
thought needs to be presented in an 
accounting of the main sex-concord- 
ance findings in this paper. If the 
aforementioned data are valid in that 
the sex-concordance ratios found in 
primary family relationships do not 
obtain among collateral relatives, we 
are led to infer that some factors 
peculiar to the structure of nuclear 
family life are contributing to those 
ratios. What might these factors be? 
In a previous paper, I examined Jack- 
son's (1960) "confusion of identity" 
hypothesis as one such possible 
factor and found it wanting (Rosen- 
thal, 1960). 

Psychologists have been paying 
increasing attention to a related con- 
cept borrowed from psychoanalysis, 
called "identification," but they do 
not always agree on its definition, 
either in conceptual or operational 
terms. One aspect of the concept 
which has been studied in some detail 
has been called sex-role identification. 
To have explanatory power with re- 
spect to the sex-concordance ratios 
above, sex-role identification should 
be demonstrably greater among fe- 
males than males, and among same- 
sexed than opposite-sexed pairs of 
family members. 

Because children of both sexes have 
most contact during early rearing 
with their mother, they first identify 
with her, but the male child soon 
shifts to a masculine identification 
(Lynn, 1959). By age 3, most chil- 
dren are able to make sex-role dis- 
tinctions and this knowledge in- 
creases with age (Brown, 1956, 1958; 
Fauls & Smith, 1956; Rabban, 1950; 
Sears, Maccoby, & Levin, 1957). The 
girl retains her feminine identifica- 
tion, which is apparently not weak- 
ened even though she is likely to go 
through a stage of developing a pref- 
erence for more masculine activities. 
Girls are in more direct contact with 
their mothers than boys are with 
their fathers, so that whereas the girl 
is more likely to identify with a 
specific feminine model, viz., mother, 
the boy tends to identify with the 
cultural stereotype of masculinity 
rather than directly with his father 
(Lynn, 1959; Stoke, 1950). Women 
tend to be more like their mothers 
than their fathers in areas of major 
interest, but the reverse is not true 
for men (Beier & Ratzeburg, 1953; 
Gray & Klaus, 1956). However, both 
men and women tend to perceive 
themselves as more like their same- 
sexed than their opposite-sexed parent 
(Beier & Ratzeburg, 1953; Crook, 
1937; Gray & Klaus, 1956; Sopchak, 
1952). Boys with older sisters tend to 
be substantially more feminine than 
boys with younger sisters, but sisters 
with older brothers show only a slight 
increase in masculine traits as com- 
pared to girls with younger brothers 
(Brim, 1958; Brown, 1956; Koch, 
1955). In the main, these findings 
lend support to the hypothesis that 
identification with same-sexed family 
members is stronger in females than 
males. 

Using an operational definition of 
identification which was then related 
to patterns of psychopathology, Sop- 
chak (1952) found that male subjects 
who had tendencies toward abnor- 
mality on the MMPI showed a lack 
of identification in varying degree 
with their fathers, their mothers, and 
"most people," in that order, the 
lack of identification in each case 
being positively correlated with scores 
on the Schizophrenia scale. Female 
subjects who had tendencies toward 
abnormality showed positive but not 
significant correlations between 
identification with mother and all 
types of abnormal trends. These 
findings could be interpreted in ways 
which would make them consistent 
with the sex-concordance ratios re- 
ported above, suggesting further lines 
of research. 

Even though the bodies of data 
presented have suggestive value in an 
accounting of the sex-concordance 
ratios found in studies of schizo- 
phrenia, the role of genetic factors 
cannot be excluded. However, if the 
found sex-concordance ratios are 
valid, it seems reasonable to conclude 
that some psychological factors are 
influencing these ratios in good part. 
Such a conclusion would be in accord 
with a previously reported finding of 
a group of schizophrenic cases where 
the genetic contribution to etiology 
was either minimal or absent (Rosen- 
thal, 1959), and with the finding that 
hereditary factors were not account- 
ing for as much of the variance with 
respect to schizophrenia as some 
leading investigators had supposed 
(Rosenthal, 1960). 


SUMMARY 


The literature regarding concord- 
ance rates with respect to schizo- 
phrenia among relatives of both sexes 
is reviewed. These rates are gen- 
erally found to be higher for female 
than male pairs and higher for same- 
sexed than opposite-sexed pairs of 
relatives in primary family groups, 
but not among collateral relatives. 
The possible role of sampling errors, 
genetic contributions, and psycho- 
logical factors in generating such 
sex-concordance ratios is examined. 
It seems reasonable to infer that 
psychological factors are influencing 
the sex-concordance ratios. Two 
lines of thought in the psychological 
literature, "anxiety-generalization" 
and “sex-role identification," are 
briefly discussed as such possible 
factors. 

THE PERCEPTION OF DEPTH THROUGH MOTION¹ 


MYRON L. BRAUNSTEIN² 
University of Michigan 


The classical cues to depth percep- 
tion, as outlined in almost every gen- 
eral psychology text, do not ade- 
quately handle a class of depth 
phenomena which has occasionally 
been described in the literature and 
has only recently been systematically 
studied. The common characteristic 
of these phenomena is this: A vis- 
ual pattern, which when stationary 
is reported to appear two-dimensional 
by most observers, is transformed in 
some manner, and upon viewing this 
transformation at least some ob- 
servers report seeing a form moving in 
other than the frontal plane, seeing a 
three-dimensional object in motion, 
or seeing a three-dimensional scene. 
In most cases, the cues of binocular 
disparity, convergence, relative size, 
interposition, linear perspective, 
aerial perspective, motion parallax, 
light and shade, and accommodation, 
as classically defined, are ineffective, 
for the stimuli are abstract figures 
projected onto a flat surface. 

“Motion perspective," J. J. Gib- 
son's (1950) name for the perspective 
of change of position, as contrasted 
with the more familiar perspective of 
position, is closely related to these 
phenomena. As an object moves, the 
projections of its surface features and 
of its contours undergo certain reg- 
ular transformations. As the observer 
moves, the entire retinal image un- 
dergoes similar transformations. It is 
the transformation of the entire ret- 
inal image which Gibson refers to 
as motion perspective. The trans- 
formations which particular objects 
or patterns in motion undergo will 
also be termed motion perspective in 
this paper, for there does not appear 
to be any reason for a general distinc- 
tion between these aspects of depth 
perception, and no suitable label has 
been applied to transformations of 
parts of the visual field as a cue to 
depth. 

1 This paper is based on a chapter in a 
dissertation submitted to the Department of 
Psychology at the University of Michigan in 
partial fulfillment of the requirements for the 
PhD degree. 

2 Now at Cornell Aeronautical Laboratory, 
Incorporated. 

The author is grateful to J. J. Gibson of 
Cornell University for his critical reading of a 
preliminary draft of this paper. 

Almost none of the psychological 
studies carried out more than 10 
years ago, and few of the recent ones 
which illustrate motion perspective, 
were systematic attempts to study 
this aspect of depth perception. The 
earliest references to the part played 
by continuous transformations of 
the projections of objects in depth 
perception treat either the perception 
of motion in depth, or its partial 
failure, as an illusion. 


DEPTH ILLUSIONS BASED ON MOTION 
The Windmill and Fan Illusions 


Sinsteden (Boring, 1942, p. 270), 
Kenyon (1898), and Johnson (1927) 
reported discoveries of an illusion 
connected in the former case with a 
distant windmill and in the latter 
cases with a two-bladed electric fan. 
The blades of the windmill or fan 
were sufficiently distant from the ob- 
server to render most of the cues to 
their relative distance ineffective. As 
the blades moved, their direction of 
rotation was ambiguous and would 
appear to reverse from time to time. 
Of interest here is Kenyon's report 
that the blades could at times be seen 
as rotating or oscillating in three di- 
mensions, and at other times could be 
seen as expanding and contracting 
in two dimensions. Miles (1929, 
1931) has discussed this illusion and 
demonstrated it using what he calls a 
"kinephantoscope." This consists 
of a two-bladed fan rotating between 
a light source and a milk glass. 
Almost all observers at some time re- 
ported seeing the fan rotating. Rota- 
tion was seen in both directions. 
Most observers reported seeing the 
blades expanding and contracting in 
two dimensions at other times. In a 
second experiment, most observers 
reported being able to see the kind of 
motion called out by the experimenter 
within an allotted time. 


Lissajous Figures 


Persons working with oscilloscopes 
are usually familiar with another 
"illusion" of motion in depth. If two 
oscillators are connected to an 
oscilloscope, one to the horizontal and 
one to the vertical input, and are ad- 
justed to frequencies in simple 
numerical ratio, Lissajous patterns 
result. By slight mistuning of the 
frequencies, the patterns can be set 
into apparent motion. They may be 
perceived as rotating about a vertical 
or horizontal axis, depending upon 
which input receives the higher fre- 
quency. Speed of rotation varies with 
amount of mistuning. Complexity 
of the pattern is a function of the ratio 
of the frequencies used, 1:1 giving a 
circle, 2:1 a two-looped figure, etc. 
Direction of rotation and brightness 
of the pattern may also be readily 
varied. There is no "perspective" in 
Lissajous patterns and they con- 
sequently resemble three-dimensional 
wire figures shown in parallel projec- 
tion. 
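
The resemblance to a wire figure in parallel projection can be checked numerically. The following Python sketch is an illustration only and is not taken from any of the studies reviewed. It assumes ideal sinusoidal inputs, assumes the faster signal is on the horizontal input, and uses hypothetical function names. Under those assumptions the trace with a given phase offset coincides with the parallel projection of a wire wound on a cylinder and turned through the same angle about its vertical axis, and a constant mistuning makes that angle grow linearly with time.

```python
import math


def lissajous_point(t, ratio, phase):
    """Oscilloscope trace; the horizontal input runs `ratio` times faster
    than the vertical input and leads it by `phase` radians (assumed setup)."""
    return math.sin(ratio * t + phase), math.sin(t)


def rotated_wire_point(t, ratio, angle):
    """Point on a wire wound around a unit cylinder with a vertical axis,
    rotated by `angle` about that axis, then projected in parallel onto
    the frontal (x, y) plane."""
    x, y, z = math.sin(ratio * t), math.sin(t), math.cos(ratio * t)
    x_rot = x * math.cos(angle) + z * math.sin(angle)  # rotation about the y axis
    return x_rot, y  # parallel projection: depth is simply discarded


def phase_after(seconds, mistuning_hz):
    """Phase offset accumulated when one oscillator is off by `mistuning_hz`."""
    return 2 * math.pi * mistuning_hz * seconds


if __name__ == "__main__":
    for ratio in (1, 2, 4):
        for k in range(50):
            t = 2 * math.pi * k / 50
            phase = phase_after(seconds=0.3, mistuning_hz=0.5)
            a = lissajous_point(t, ratio, phase)
            b = rotated_wire_point(t, ratio, phase)
            assert math.isclose(a[0], b[0], abs_tol=1e-12)
            assert math.isclose(a[1], b[1], abs_tol=1e-12)
    print("trace and projected wire figure coincide")
```

In this sketch a mistuning of 0.5 cycles per second corresponds to half a revolution of the simulated wire figure per second, which is one way of reading the statement that speed of rotation varies with amount of mistuning.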

Rotating Lissajous patterns were 
introduced to the psychological litera- 
ture by Weber (1930), who used two 
tuning forks arranged at right angles 
with mirrors pasted to their tips, 
which reflected light onto a screen. 
Weber found that with the tuning 
forks set in nearly a 1:1 ratio, a 
circle would be seen alternately 
rotating about each of two perpen- 
dicular diagonal axes. He discussed 
attitudinal influences on whether or 
not the figure was seen as rotating in 
three dimensions, and upon the direc- 
tion of perceived rotation. 

Philip and Fisichelli (1945) used an 
oscilloscope and electronic oscillators 
to study parameters influencing the 
rate of reversal of apparent move- 
ment in Lissajous figures. Using fre- 
quencies in the ratios 4:1, 6:1, and 
8:1, they instructed observers to 
press a key when the direction of 
movement of the pattern seemed to 
reverse. Increase in complexity of 
the figures and increase in speed, to a 
lesser degree, were found to enhance 
the rate of apparent reversal. Wide 
individual differences were found in 
the number of reversals. As would be 
expected in the case of parallel pro- 
jections according to Gibson (1957), 
there was no significant overall 
preference for left or right direction of 
movement. 

Fisichelli (1946), using the same 
apparatus, investigated the effects of 
axis of rotation and height-width 
ratio of the pattern upon the rate of 
apparent reversal, finding more re- 
versals with a horizontal axis and a 
limited effect of height-width ratio. 
In both studies, observers reported 
brief interruptions of continuous 
rotary movement during observation 
of the figures. These were called 
"wiggle," "flickering," and "move- 
ment two ways at once," and indi- 
cated perception of a two-dimen- 
sional rather than a three-dimen- 
sional figure. From the point of view 
of the present paper, it is unfortunate 
that there has been no subsequent 
research on the factors determining 
whether motion is seen in two dimen- 
sions or three dimensions when vari- 
ous Lissajous patterns are displayed 
under controlled conditions. 


Stereokinetic Phenomena 


A class of depth "illusions," to 
which Musatti (1924) has applied the 
term "stereokinetic," appeared in the 
European literature several decades 
ago. A theoretical discussion of the 
effect, along with an extensive 
bibliography, may be found in a 
paper by Musatti (1931). The follow- 
ing description of the phenomena 
will be based on Metzger's text 
(1953, Ch. 13). 

The effect is produced by rotating 
certain patterns about an axis parallel 
to the line of regard. The physical 
motion thus takes place in the frontal 
plane. However, these two-dimen- 
sional figures, which give little or no 
impression of depth when viewed at 
rest, may take on a three-dimen- 
sional appearance while being ro- 
tated. Several examples of this effect 
follow: 

If two ellipses are drawn such that 
the center of the smaller is between 
the center and edge of the larger, 
while the minor axes are on the same 
line, and the figure is rotated as de- 
scribed above, it may take on the 
appearance of a “lampshade,” rising 
from the plane on which the figures 
are drawn. If an ellipse is drawn with 
clockface markings and an arrow is 
drawn along its minor axis, rotation 
of the figure may cause the arrow to 
appear perpendicular to the plane of 
the clockface, pointing outward in the 
"third dimension" (Figure 1a). Fig- 
ures composed of interlocking ellip- 
tical rings may yield perceptions of 
solid objects. Metzger (1953, pp. 
334-335) presents four drawings 
which, when rotated, may appear to 
be a vase, a wineglass, an hourglass, 
and a double basin, respectively. 

A recent empirical study of the 
stereokinetic effect was reported by 
Wallach, Weisz, and Adams (1956). 
In one experiment, a white ellipse 
pasted onto a black cardboard disk 
was rotated at 20 rpm. After 30 
seconds of binocular observation and 
10 to 60 seconds of monocular ob- 
servation, if an observer had failed to 
report observing a circular disk 
rolling around on its edge, this 
possibility was suggested to him. Of 
47 observers, 6 reported seeing the 
tilted disk during monocular observa- 
tion, without suggestion; 34 reported 
seeing it only after suggestion; 7 re- 
ported never seeing it. Most ob- 
servers were able to describe aspects 
of the tilting disk not suggested by 
the experimenter such as slant 
changes, indicating that they were 
not merely repeating the experi- 
menter's suggestions. 

FIG. 1. Examples of patterns which may appear three-dimensional when rotated. (Shown 
are: Metzger's "self-willed arrow"; the overlapping rings of Wallach, Weisz, and Adams; 
and Fischer's offset circles.) 

A similar procedure was used in a 
second experiment, in which the 
stimulus was a pattern of six over- 
lapping rings (Figure 1b). None of 12 
naive observers reported seeing the 
figure as three-dimensional when 
stationary, but 10 reported it as such 
during binocular observation of its 
rotation, and one more during monoc- 
ular observation, leaving only one 
who required suggestion. The three- 
dimensional form was described as 
resembling a bedspring. 

Fischer (1956) systematically 
studied the effects of several factors 
on the stereokinetic effect. These 
were off-set (the extent to which two 
circles overlapped, varying from con- 
centric to tangential), placement (of 
the circles on the turntable), monoc- 
ular versus binocular observation, 
equality versus inequality of circle 
size, and direction of rotation of the 
turntable (Figure 1c). Eleven stimu- 
lus figures were used. The observer 
was asked to estimate the "amount" 
of depth perceived by adjusting a 
sliding gauge. 

For a depth effect to be obtained, it 
was necessary and sufficient that the 
distance between the centers of the 
two circles be greater than zero but 
less than r₁ + r₂, or 2r in the case of 
equal circles. Within these limits, 
amount of depth judged increased 
with increasing distance between the 
centers. A greater amount of depth 
was judged with monocular viewing. 
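
Fischer's boundary condition can be written as a one-line predicate. The sketch below is only a restatement of the condition as reported above. The function and parameter names are hypothetical, and the sketch says nothing about the amount of depth judged, only about whether the center distance falls inside the stated limits.

```python
def depth_effect_expected(center_distance, r1, r2):
    """True when the distance between the two circle centers is greater than
    zero (not concentric) and less than r1 + r2 (not yet tangential)."""
    return 0 < center_distance < r1 + r2


if __name__ == "__main__":
    # Unequal circles with radii 3 and 2 (arbitrary units chosen for illustration).
    print(depth_effect_expected(0, 3, 2))    # concentric
    print(depth_effect_expected(2.5, 3, 2))  # off-set
    print(depth_effect_expected(5, 3, 2))    # tangential
    # Equal circles: the upper limit becomes 2r.
    print(depth_effect_expected(3.9, 2, 2))
```

The two excluded cases correspond to the concentric and tangential arrangements, which are the end points of the off-set variable in Fischer's design.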

A differential size cue was neither 
necessary nor sufficient to produce a 
depth effect. Concentric circles were 


generally reported as flat when ro- 
tated, despite size differences, while 
off-set circles of the same size elicited 
reports of perceived depth. 

Fischer also eliminates motion 
parallax as a sufficient cue on the 
basis of a consideration of the differ- 
ential rates of physical motion of the 
circles drawn on the rotating disk. 
Tangential circles had differential 
rates of motion but did not yield re- 
ports of depth perception, although 
when off-set was also present, the 
more rapidly moving circle tended to 
be judged as the nearer. 

In addition to studies treating as- 
pects of motion perspective as illu- 
sions, there has been research con- 
cerned primarily with other aspects 
of perception which has turned up 
instances of perceived depth in the 
absence of other cues. 


Metzger's Research 

Metzger (1934a) devised apparatus 
for studying “phenomenal identity,” 
consisting of a turntable placed be- 
tween a light source and a translucent 
screen. Rods could be placed in vari- 
ous positions on the turntable, stand- 
ing upright. The subject could not see 
either end of the rods through the 
screen, and of course could not see 
the turntable. The distal stimulus 
was a number of parallel shadows of 
vertical lines, moving back and forth 
across the screen as the turntable re- 
volved.  Metzger's purpose was to 
observe the changes in perceived 
identity which occurred when two 
shadows crossed and separated again. 
But he found an unexpected effect 
which he followed up in a subsequent 
study (1934b). Instead of shadows of 
lines moving back and forth in the 
plane of the screen, the subjects fre- 
quently reported perceptions of rods 
rotating in the third dimension. 
Imaginary lines connecting the pro- 
jections of these rods would change 
only in length as the turntable ro- 
tated, rather than in both length and 
direction, as Wallach and O'Connell 
(1953) considered necessary for a 
kinetic depth effect. Metzger's 
stimuli gave rise to reports of vary- 
ing perceptions, readily influenced by 
suggestion, and were similar in effect 
to the “windmill” and “fan” illusions, 
discussed earlier. 


Johansson's Research 


Johansson (1950) employed ap- 
paratus which permitted movable 
objects to be projected onto a trans- 
lucent screen. The objects were 
drawn or pasted onto celluloid disks. 
As many as six disks could be em- 
ployed simultaneously, and six 
mechanical systems independently 
controlled their motion. The disks 
could be moved in horizontal, ver- 
tical, sloping, circular, or elliptical 
paths, in a frontal plane. Shadows of 
the objects in motion were viewed by 
the observers through the translucent 
screen. 

In a number of Johansson’s experi- 
ments, the observers reported per- 
ception of movement in three dimen- 
sions, although the distal stimulus 
was always two-dimensional. A 
three-dimensional perception was 
generally reported as secondary to a 
more easily elicited two-dimensional 
perception. 

In one experiment a row of six 
bright spots on a homogeneous dark 
field was displayed. The two end 
spots were stationary. The two spots 
located one position from either end 
were simultaneously moved up and 
down within the same time interval, 
but with an amplitude approxi- 
mately double that of the outer two. 
The reported perceptions were of 
spots on a harmonically swinging 

line. Several observers also reported 
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perceptions of the spots as being 
knots on a rope swinging in the third 
dimension. 

In another experiment, two dark 
spots on a homogeneously bright 
field were moved along perpendicular 
paths such that they met (and fused) 
at the center of their respective 
paths. There were several varieties of 
two-dimensional perceptions re- 
ported, but these did not include the 
"veridical" perception of two spots 
moving along perpendicular paths. 
Instead, the spots were reported as 
appearing to be two spots moving 
along a common sloping path, at 
times penetrating or passing through 
one another, and at times colliding 
and recoiling. The three-dimensional 
perception reported was that of the 
spots forming the terminals of a rod 
perpendicular to the frontal plane 
which ascends and descends an in- 
clined axis on the frontal plane. 

Additional reports of three-dimen- 
sional perceptions occurred in some of 
the many other experiments reported 
by Johansson. 


MOTION AS A CUE IN DEPTH 
PERCEPTION 


Research designed for the systemat- 
ic study of motion perspective, or of 
the "kinetic depth effect" as the 
aspect of motion perspective under 
consideration has been called by 
Wallach and O'Connell (1953), has 
come primarily from three sources: 
J. J. Gibson and his associates at 
Cornell University, Wallach and his 
associates at Swarthmore, and B. F. 
Green and his group at the Mas-
sachusetts Institute of Technology
Lincoln Laboratory.


The Kinetic Depth Effect 


In a series of experiments, Wallach 
and O'Connell (1953) investigated
the conditions leading to what they
termed the kinetic depth effect. The
effect is said to occur when a form 
placed between a point light source 
and a translucent screen casts a 
shadow which appears two-dimen- 
sional when the form is at rest, but 
casts shadows yielding perceptions of 
a three-dimensional form, when the 
object is rotated. 

Solid forms and wire outline figures 
in various shapes were rotated about a 
vertical axis. When such transforma- 
tions resulted in shadows having con- 
tours which simultaneously changed 
in both length and direction, most ob- 
servers reported three-dimensional 
perceptions. The light source was 
sufficiently distant to result in nearly 
parallel projections. In some cases, as 
would be expected, perceived direc- 
tion of rotation appeared to be a 
chance matter, and spontaneous re- 
versals of direction occurred. 

In one experiment, three rods 
meeting at a point at angles of 110 de- 
grees were rotated. If the ends of the 
rods were visible, three-dimensional 
perceptions were elicited, but if the 
ends were concealed, two-dimensional 
perceptions were reported, indicating 
that changes in direction of contours 
(i.e., sizes of angles) without changes 
in length of contours are insufficient
for the kinetic depth effect. 

In another experiment, a T shaped 
wire figure and a wire equilateral 
triangle were rotated. The former 
was reported to appear two-dimen-
sional while the latter was described
as three-dimensional, indicating that 
changes in length of contours without 
changes in direction are insufficient 
for the kinetic depth effect. Other 
experiments demonstrated that if 
shadows consisted of several in-
variant elements, a kinetic depth
effect was elicited if imaginary lines 
connecting the elements changed in 
both length and direction. 
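
The two conditions isolated in these experiments can be illustrated numerically. What follows is a minimal sketch, not Wallach and O'Connell's procedure: it assumes an idealized parallel projection onto the screen and rotation about the vertical (y) axis, and the function name `shadow_contour` and the sample coordinates are hypothetical.

```python
import math


def shadow_contour(p, q, theta):
    """Shadow of the segment p-q after rotating it by theta radians about
    the vertical (y) axis, under parallel projection onto the x-y plane.

    Returns (length, direction), with direction in degrees in [0, 180).
    """
    def shadow_point(point):
        x, y, z = point
        # Rotation about the y axis changes x and z; the projection drops z.
        return (x * math.cos(theta) + z * math.sin(theta), y)

    (x1, y1), (x2, y2) = shadow_point(p), shadow_point(q)
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    direction = math.degrees(math.atan2(dy, dx)) % 180.0
    return length, direction


# Hypothetical contours: the horizontal bar of a T figure, and one oblique
# side of a triangle.
horizontal_bar = ((-1.0, 0.0, 0.0), (1.0, 0.0, 0.0))
oblique_side = ((0.0, 1.0, 0.0), (1.0, 0.0, 0.0))

for name, (p, q) in (("bar", horizontal_bar), ("side", oblique_side)):
    for degrees in (0, 60):
        length, direction = shadow_contour(p, q, math.radians(degrees))
        print(name, degrees, round(length, 3), round(direction, 1))
```

Under these assumptions the shadow of the horizontal bar changes in length only, while the shadow of the oblique side changes in both length and direction, which is the condition reported as yielding three-dimensional perceptions.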

In a subsequent study of the
kinetic depth effect (Wallach, O'Con-
nell, & Neisser, 1953), wire figures
were placed between the light source
and the screen. At rest, the figures
yielded shadows which a majority of
the observers reported as appearing
two-dimensional. When a figure
was turned back and forth, most ob- 
servers reported perceiving shadows 
of three-dimensional forms. After 
intervals of from several minutes to a 
week, most observers, when presented
again with stationary shadows, now
reported perceiving them as three-
dimensional, without rotation. Re-
versals of the Necker-cube type
occurred after prolonged exposure to 
such stationary figures, indicating 
that the memory effect was resulting 
in more than a tendency to report 
three-dimensional perceptions when 
presented with a stationary figure 
previously shown in rotation. 


Accuracy of Kinetic Depth Perception 


In a recent systematic study of 
kinetic depth perception, White and 
Mueser (1960) employed the type of 
display introduced by Metzger 
(1934a). Pegs were inserted into 
holes on a turntable which was 
located between a distant light source 
and an aperture covered by a trans- 
lucent screen. When two pegs placed 
equidistant from the center of rota- 
tion were displayed as the turntable 
was rotated at a speed of 40 rpm, the 
observers reported perceiving mo- 
tion in three dimensions during at 
least half of the exposure time. Dura- 
tion of three-dimensional perception 
was increased with the use of fixation 
points to the left or right rather than 
at the center of the display, the use 
of pegs discriminably different in 
shape rather than identical in shape, 
and the use of horizontal as com- 
pared to vertical motion. 
than would be expected by chance

(1957) sought
to answer four questions concerning


affect the accuracy of the judgments?
Finally, how accurate are the slant
judgments obtained when a perspec-
tive view of a rotated plane is shown
without showing the transformation
leading up to it?


“shadow transformer.” It consisted
basically of a turntable placed be-
tween a point light source and a
translucent window. Four patterns,
“an amoeboid group of amoeboid 
dark shapes or spots (the irregular 
texture), a solid amoeboid contour 
form (the irregular form), a square 
group of dark squares (the regular 
texture), and a solid square (the 
regular form)" were rotated to angles 
of from 15 degrees to 70 degrees 
(Gibson & Gibson, 1957, p. 132).
Unlike Green's dot figures, to be de- 
scribed below, Gibson's spots do, of 
course, change in shape and size dur- 
ing rotation of the plane. 

All observers viewing the pat-
terns in rotation reported seeing a
constant shape changing in slant,
although some reported that the dis-
play could at times be seen as the
compression of a two-dimensional 
pattern. Slant judgments, made by 
adjusting a circular model, were in 
good psychophysical correspondence 
with the length of the transforma- 
tion sequence. The use of form 
versus texture showed no effect on 
the accuracy of slant judgments.
The regularity of the pattern had at
most a slight effect. A control
group, which was shown motionless
slanted patterns, tended to see the
rotated irregular textures as being in
the plane of the screen and, although
they reported perceiving slant in
the rotated regular patterns, they
grossly underestimated the degree of
slant.

The importance of actually viewing
the rotation to accurate judgment of
slant was questioned by Sidorsky
(1958). Using apparatus similar to
Gibson’s, Sidorsky rotated a
pattern of outline squares about a
horizontal axis from 0 degrees to 40
degrees. The observers were shown
only static views of the grid, at 2-
degree steps. A shutter
closed as the plane moved between 
views. Slant judgments were about 
as accurate as those obtained by 
Gibson and Gibson (1957) for planes 
shown in motion, and accordingly far 
more accurate than those obtained by 
them using static perspective views. 
J. J. Gibson² attributes this dis-
crepancy in results to the difference
between the instructions used by 
Gibson and Gibson (1957) and those 
used by Sidorsky (1958). Sidorsky's
instructions, according to Gibson,
gave his observers "... in-
formation."

Somewhat related data come from
Langdon (1951), who had the ob- 
servers match a slanted circle to each 
of 15 ellipses. All were fluorescent 
wire outline figures, and were viewed 
monocularly with head movements 
avoided. Constancy was found lack- 
ing when the observer was required to 
adjust the circle to match an ellipse, 
but was restored when regular rotary 
motion of the circle was displayed and 
the observer was asked to press a but- 
ton when the standard and compari- 
son figures matched. Degree of con- 
stancy was found to vary directly 
with rate of rotation. 

Langdon explains the effect of 
rotation in terms of the “creation of 
an object" resulting from the regular 
changes in shape of the wire circle.
His study points up the relationship 
between kinetic depth phenomena 
and the shape constancy problem. 
The extent to which his results are 
attributable to the differences be- 
tween the psychophysical methods 
employed in the two conditions (ad- 
justment and limits), particularly to
the differing modes of response, is un-
certain.


² Personal communication, July 1959.

Green's Computer Method 


Green (1957, 1959a, 1959b) has
introduced methodology allows


computers, such as the IBM
704 and 709, can be equipped with a
cathode ray tube (CRT) output re-
corder. Instructions may be in-
cluded in a computer program which
will cause a spot to appear on the face
of the CRT at


A camera at- 
tached to the CRT records each spot 
as it appears. The spot disappears in 
less time than it takes to plot the next 
point, allowing for even exposure of 
the spots. The shutter remains open 
while the points are plotted, permit- 


be used to 
frame (IBM, 1955). It is thus possi- 
ble to points in a pattern, 
— the pattern, compute the 
location of the points at small inter- 
undergoes a
dimensional projection of any mathe-
matically specifiable transformation
of any figure. 

Essentially, a three-dimensional 
pattern (or a one-dimensional or two- 
dimensional pattern in three dimen- 
sions) is conceived or randomly 
generated. The coordinates of the 
points in the pattern are listed in an 
n X 3 matrix where each row rep-
resents a point and the column en-
tries are x, y, and z coordinates, re-
spectively. Successive orthogonal
transformations of the pattern may
be accomplished by multiplying this
matrix by a 3 X 3 orthogonal trans-
formation matrix. A two-dimensional
projection may be made of each of the
points represented in an n X 3 matrix
by multiplying both the x and y co-
ordinates of each point by
(E - F)/(E - Z), where Z is the z coordinate of the
point, E is the distance of the projec- 
tion point from the origin of the dis- 
play, and F is the distance of the pro- 
jection plane from the origin of the 
display (Green, 1959b). 
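
The matrix operations just described can be sketched in a few lines. This is an illustration in Python under stated assumptions, not Green's program: the function names are hypothetical, points are treated as row vectors, the rotation is taken about the vertical (y) axis, and the projection point is assumed to lie on the z axis at distance E from the origin, with the projection plane perpendicular to that axis at distance F.

```python
import math


def rotation_about_vertical(theta):
    """Return a 3 x 3 orthogonal matrix for a rotation of theta radians
    about the vertical (y) axis, for use with row-vector points."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, -s],
            [0.0, 1.0, 0.0],
            [s, 0.0, c]]


def transform(points, matrix):
    """Multiply an n x 3 matrix of points (one point per row) by a 3 x 3 matrix."""
    return [[sum(p[k] * matrix[k][j] for k in range(3)) for j in range(3)]
            for p in points]


def project(points, E, F):
    """Scale the x and y coordinates of each point by (E - F) / (E - Z)."""
    return [(x * (E - F) / (E - z), y * (E - F) / (E - z)) for x, y, z in points]


def film_frames(points, step, count, E, F):
    """Return `count` successive projections of the pattern, rotating it
    by `step` radians between one frame and the next."""
    rotation = rotation_about_vertical(step)
    current = [list(p) for p in points]
    frames = []
    for _ in range(count):
        frames.append(project(current, E, F))
        current = transform(current, rotation)
    return frames
```

Each element of the list returned by `film_frames` holds the two-dimensional coordinates for one frame. Substituting a different orthogonal matrix for the one built by `rotation_about_vertical` would give rotation about another axis.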

Green has carried out a series of 
studies “to determine the conditions
under which the two-dimensional pro- 
jection (e.g., shadow) of a rotating 
three-dimensional figure is perceived 
as a rigid coherent figure with 
depth" (1959a, p. 9). In these ex- 
periments the observers are shown 
motion pictures prepared as described 
above and asked to rate each stimulus 
film "on a subjective scale of co- 
herence or rigidity—according to the 
degree to which the parts of the figure 
seem to maintain the same relative 
positions as the figure moves." Rat- 
ings of coherence were found to in- 
crease with number of elements in 
the figure. Rated coherence was 
greatest for patterns shown rotating 
about a vertical axis, intermediate for 
patterns “tumbling” about a fixed 
origin and lowest for patterns rotat-
ing about a skew axis, at an angle of
45 degrees from the vertical and the 
same angle from the plane of the dis- 
play. Speed of rotation had little 
effect on the ratings except in the 
case of the slowest speed used, which 
showed a decrease in rated coherence. 

Green's results with line segments 
(1959a) were similar with respect to
speed, numerosity, and method of 
rotation. More rated coherence was 
found for a given number of line seg- 
ments than for that number of spots. 
Unlike the projections of the points, 
which did not change in size or shape 
as the patterns rotated, the projec- 
tions of the line segments changed in 
length as the projections of their end- 
points changed in distance on the 
plane of the display. Rated co- 
herence was greater for connected 
line segments than for unconnected 
ones. The connected line segments 
produced figures similar in appear- 
ance to the wire figures used by 
Wallach and O'Connell (1953) in the 
study described above. 


DISCUSSION 


Much of the work done in the area 
of kinetic depth perception was ac- 
complished by psychologists of the 
gestalt school, and it is not surprising 
that gestalt principles have been 
applied to these phenomena. In a 
chapter on space perception, Koffka 
(1930) contends that whether a
stimulus is seen as two-dimensional 
or three-dimensional depends upon 
which mode of organization allows 
for greater symmetry and unity. Re- 
ferring specifically to the percep-
tion of figures in motion, he postu-
lates a "tendency to make the total
path (of all moving parts) as simple 
and well-shaped as possible" (1935, p.
301). We are left, as is usual in the 
case of gestalt explanations, with the 
problem of determining what the 
simplest, most symmetrical motion
would be. One might readily deter- 
mine which of several shapes is the 
simplest and most symmetrical, but 
when dealing with perceived motion, 
how does one, a priori, decide whether, 
for example, expansion-contraction 
or complete rotation has better 
"form"? Despite such theoretical dif- 
ficulties, gestalt psychology has done 
much to fill the void in perceptual re- 
search left by behaviorism, and the 
theoretical commentaries of Wallach 
and his associates, and of Metzger, 
are worth reviewing. 

The strongest, and perhaps the 
most controversial point made by 
Wallach is that, since any single pro- 
jection of the objects used as shadow- 
casters in his kinetic depth studies 
does not look three-dimensional, 
“the perceived three-dimensional 
form is not determined merely by 
what is presented on the retina at a 
given moment" (Wallach & O'Con- 
nell, 1953, p. 207). Instead, it is
"necessary to ascribe to a memory
trace the power to determine the
organization of a visual form process"
(Wallach, O'Connell, & Neisser,
1953, p. 364).

Gibson (1957) opposes the reducing 
of motion perspective to remember- 
ing: 

Does a stimulus last for a second, a milli- 
second, or a microsecond? ... Is it not
theoretically preferable to suppose that a 
transformation is a stimulus in its own right, 
just as a nontransformation is a stimulus? 
Or, still better, that sequence, as well as pat- 
tern, is a variable of stimulation? (p. 136).* 


* The concept of a transformation serving as
a stimulus is also found in the biological
model of perception of Pitts and McCulloch
(1947). Of particular relevance here is their
formulation of the exchangeability of time
and space in form perception.


Metzger (1953, p. 335) also argues
that the motion itself, and not a
succession of stationary forms, should
be considered the stimulus. He goes 
on to make the assertion that the 
principles applicable to stationary 
patterns are also applicable to pat- 
terns in motion, if instead of consider- 
ing their three-dimensional form, we 
consider the form of their motion 
(1953, p. 345). That is, as he pre- 
viously proposed (1934b, p. 258), 
tendencies toward unity, simplicity, 
symmetry, continuity, and good form 
of the ongoing motion form the basis 
of kinetic depth phenomena. Metz-
ger's formulation is similar to
Koffka's, and is subject to the same 
difficulties. 

A perceptual theory handling the 
phenomena described in this paper, 
and suggesting numerous problems 
for experimental investigation, is that 
of J. J. Gibson. Basic to his theory is 
the postulate that “The stimulus-
variable within the retinal image to
which a property of visual space cor-
responds need be only a correlate of
that property, not a copy of it” (1950,
p. 8). Kinetic depth perception
would then be studied by seeking 
the stimulus correlates for depth, in 
the optic arrays associated with 
moving objects, and Gibson has 
stated an hypothesis about what 
these are: "Any regular transforma- 
tion of a bidimensional image tends 
to yield a tridimensional motion in 
perception, and the kind of motion 
perceived depends on the kind of 
transformation" (1954, p. 311). This 
principle is expressed again in the 
postulate that "An eye is a device 
which registers the flow pattern of an 
optic array as well as the static pat- 
tern of an array. Conversely, such a 
family of continuous transformations 
is a stimulus for an eye. There are 
quite specific forms of continuous 
transformations, and the visual sys- 
tem can probably discriminate among 


432 


them” (1958, p. 185). It follows that 
"A psychophysics of kinetic impres- 
sions would require a mathematical 
analysis and classification of the 
motions or transformations of a 
retinal image" (1954, p. 312). Such 
an analysis might begin with a con- 
sideration of the kinds of motions 
possible in an optic array, and Gibson 
(1954, 1957) discusses these motions.
There are the rigid motions of transla- 
tion and rotation, which, since each 
can occur with respect to any of
three axes, give us six continuous 
perspective transformations. There 
are also nonperspective or elastic 
transformations, characteristic of liv- 
ing organisms, and finally disjunctive 
motions of the parts of a pattern. 

Gibson (1951, pp. 404-405) dis-
putes the classical assumption that
"two-dimensional vision is immediate,
primitive or sensory, while three- 
dimensional vision is secondary, 
derived or perceptual" and suggests 
that other theories may fail to be 
convincing in their explanations of 
three-dimensional perception because 
they are guided by this assumption. 
Certainly the inverse assumption 
would lead to a different orientation
in the kind of research discussed in 
this paper, for the classical assump- 
tion seems, along with technical 
difficulties in experimentation, re- 
sponsible for the fact that although 
systematic work has long been carried 
on with stationary two-dimensional 
forms, kinetic depth phenomena, un- 
til recently, were only "curious illu- 
sions." 

Gibson's theory is a promising 
approach to the systematic study of 
depth perception, and his research 
has contributed considerably to our 
knowledge of this area. It is Green's 
empirical approach, however, largely 
because of its more sophisticated 
methodology, which lends the greatest
promise to the actual development of 
a psychophysics of depth perception. 
Although the contribution in the area 
of mechanics of creating stimuli is 
important, in that it has made such 
research more practical, it is the crea- 
tion of stimuli directly from mathe- 
matical formulae which should prove 
of especially great significance in the 
study of complex perceptual phe- 
nomena. 

Scientific interest in color began in
the latter part of the seventeenth 
century with the research of Newton 
(1672, 1757) on light and colors. Of 
historical significance was his dis- 
covery that white (or grey) and all 
other colors are, in fact, reproducible 
with a mixture of two or more kinds 
of homogeneous light (Newton, 
1704). Though his concepts regard- 
ing the perception of colors were tra- 
ditional in the sense that physical 
properties of objects were transmit- 
ted to the sensorium, it can be said 
that he influenced subsequent theo- 
ries by assigning a definite role to the 
optic nerve fibres. In a letter to the 
Royal Society in 1675 (cf. Newton, 
1757) he suggested that light rays 
excited vibrations in the retinal 
terminations of the optic nerve, that 
the vibrations were transmitted by 
the optic nerve fibres to the sen- 
sorium, and that here the different 
colors were experienced according to 
the strength and mixture of their 
vibrations. In a fancied analogy with 
the notes of a musical scale, Newton 
supposed there were seven kinds of 
light, each with its characteristic 
vibration rate. 

During the eighteenth century 
some physicists expressed the opinion 
that a minimum of three elementary 
or primary colors were necessary to
reproduce all the known colors 
(Helmholtz, 1867, 1924-25).  Ex- 
perimenting with mixtures of colored 
pigments they concluded that red, 
yellow, and blue were the primary 
colors, green being excluded since it 
was obtainable with mixtures of yel- 
low and blue substances. Wünsch
(Göthlin, 1943), experimenting with
mixtures of spectral lights, concluded 
that red, green, and violet were the 
primary colors. In 1758, the mathe- 
matician Tobias Mayer put forward 
the view that all spectral lights were 
mixtures of three kinds of light, 
namely, red, yellow, and blue. In 
1777, Giros de Gentilly (1785) argued 
not only for a trichromasy of physical 
light but also for a trichromatic phys- 
iological mechanism in the retina. 
He speculated that color perception 
was mediated by three types of mem- 
branes or molecules, each selectively 
sensitive to one of the three kinds of 
light. He made the original sugges- 
tion that defective color vision was 
due to the inactivity of one of the 
membranes or groups of molecules. 

By the end of the eighteenth cen- 
tury it had become quite clear that 
although perceived colors may be 
limited, white light actually consists 
of an infinite number of rays, differ-
ing from each other in color and
refrangibility. In his discussion of the 
physical theory of light, Thomas 
Young (1802b) found it necessary to 
modify Newton's theory concerning 
the perception of colors. He pointed
out that since:

it is almost impossible to conceive of each
sensitive point of the retina to contain an in- 
finite number of particles, each capable of 
vibrating in perfect unison with every possible 
undulation, it becomes necessary to suppose 
the number limited; for instance, to the three 
principal colors red, yellow and blue. 


It was possible, he said, that each 
sensitive filament of the nerve con- 
sisted of three portions, one for each 
principal color. A year later, Wol- 
laston's (1802) description of the 
spectrum led Young (1802a, 1807a, 
1807b) to modify his previous re- 
marks regarding “the proportions of 
the sympathetic fibres of the retina.” 
He now substituted red, green, and 
violet for red, yellow, and blue. 

In 1794, the distinguished chemist
John Dalton (1798) presented to the
Manchester Literary and Philosophi-
cal Society a dramatic account of his
own peculiar vision for colors. While 
it is certain that such persons with 
defective color vision were not new to 
the human race, recorded instances of 
them can be traced only to the seven- 
teenth and eighteenth centuries 
(Huddart, 1777; Turberville, 1684; 
Whisson, 1778). Dalton’s testimony 
stimulated scientific interest in the 
phenomenon of color blindness. He 
said: “In the solar spectrum three 
colors appear; yellow, blue and pur- 
ple. The two former make a con- 
trast; the two latter seem to differ 
more in degree than in kind." (What 
basis Dalton used for naming his 
colors is a mystery! The names are 
almost but not quite what one would 
expect from a unilaterally deuteran- 
opic subject who can give "normal" 
names to colors seen in the color- 
blind eye [Graham & Hsia, 1958].)
Dalton supposed his anomaly was 
due to a color medium in his eye 
which absorbed only the red and
green rays. At his request, a post-
mortem examination of his eyes was
made (Henry, 1854) and his theory
disproved.

Thomas Young (1807a, 1807b) 
ascribed Dalton's peculiar color vi- 
sion to "the absence or paralysis of 
those fibres of the retina which are 
calculated to perceive red." A con- 
troversy immediately arose as to 
whether Dalton and others like him 
were totally insensitive to the red end 
of the spectrum (Seebeck, 1837; Wil- 
son, 1855). Dalton's statements had 
failed to make this point clear. 

Of the early attempts to classify 
color-blind persons (Purkinje, 1828; 
Seebeck, 1837; Szokalski, 1841; 
Wartmann, 1846; Wilson, 1855), that
of Seebeck received the most atten-
tion. He used the first formal screen-
ing tests for the classification of color 
vision types. Though red-green con- 
fusion was common to all his observ- 
ers, they fell into two groups regard- 
ing the visible limits of the solar
spectrum. The first group reported 
seeing the normal limits and called 
the spectral colors “blue” and “red.” 
The second group was relatively in- 
sensitive to the red end of the spec- 
trum and called its colors “blue” and 
"yellow." We know today that color 
names given by color-blind persons 
are unreliable, depending, as they 
do, on learned cues of brightness, 
saturation, texture, or position. The
only justifiable method of identifying 
the colors these persons see in the 
spectrum is to obtain certain specific 
types of experimental data from uni- 
laterally color-blind persons with 
normal color vision in the “good” eye 
(Graham & Hsia, 1958). 

In a letter to Dalton in 1833, Her- 
schel made some significant contribu- 
tions to color theory (Henry, 1854). 
He introduced the following defini- 
tion of normal and color-blind vision: 
In normal color vision all colors can
be referred to a mixture of three pri-
maries, while in color-blind vision all
colors are referable to a mixture of
two primaries. Herschel wrote,
"Now, to eyes of your kind, it seems
to me that all your tints are referable 
to two, which I shall call A and B; the 
equilibrium of A and B producing 
your white...." Another color- 
blind scientist, William Pole (1857), 
interpreted this statement to mean 
that persons like him should see grey 
(or white) in the place of green. 
Later, James Clerk Maxwell noted
(1855a) that his color-blind subject 
saw a white in the blue-green band of 
the spectrum. This band, which came 
to be known as “the neutral point,”
has played an important role in color 
theory and the diagnosis of color 
blindness. 

In Herschel's letter there also ap-
peared for the first time the terms
"dichromic vision" and “dichroma- 
tism" (Henry, 1854). To him these 
terms described two-color vision, 
that is, vision for blue and yellow. 
Though he introduced both terms at 
the same time, dichromic was gener- 
ally adopted and used for several 
years. Wartmann (1846) confused
the issue with his "dichromatic Dal-
tonism" which referred to vision
where all colors are seen as shades of 
grey. The term dichromatic finally 
superseded dichromic when Donders 
(1884) and König (1903) applied it to
the more frequent types of color- 
blind persons who require a mixture 
of two monochromatic lights to 
match the spectral colors. 

By the middle of the nineteenth 
century many speculations concern- 
ing color perception had been ad- 
vanced (Wartmann, 1846; Wilson,
1855). In 1852, Helmholtz revived 
Young’s theory in connection with 
his well-known experiments concern- 
ing spectral complementaries (Helm- 
holtz, 1852), only to reject or ignore 
it (Helmholtz, 1853, 1855). An im-
portant outcome of these experiments
was his explanation as to why the 
laws of color mixture sometimes 
break down for mixed pigments. For 
example, pigment mixtures of blue 
and yellow give you green, but a mix- 
ture of blue and yellow spectral lights 
gives you grey or white. He pointed 
out that the final wave lengths re- 
flected by colored substances are 
those wave lengths of incident light 
that remain after successive selective 
absorptions by the colored media 
(Helmholtz, 1852). Failure to under- 
stand this principle vitiated many of 
the interpretations of workers before 
Helmholtz. 
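
The distinction Helmholtz drew can be made concrete with a toy calculation. The sketch below is illustrative only: it divides the spectrum into three coarse bands, and the reflectance values assigned to the blue and yellow pigments are hypothetical, not measured data. Pigment mixture is modeled as successive selective absorption (a band-wise product). Light mixture is modeled as the band-wise sum of the light each pigment reflects, as on a rotating disk, which is a simplification of the spectral-light case.

```python
BANDS = ("red", "green", "blue")

# Hypothetical fractions of incident light reflected in each band.
BLUE_PIGMENT = {"red": 0.0, "green": 0.5, "blue": 1.0}
YELLOW_PIGMENT = {"red": 1.0, "green": 0.5, "blue": 0.0}


def pigment_mixture(a, b):
    """Light remaining after successive absorption by both pigments."""
    return {band: a[band] * b[band] for band in BANDS}


def light_mixture(a, b):
    """Superposition of the two reflected lights."""
    return {band: a[band] + b[band] for band in BANDS}


def strongest_bands(spectrum):
    """Bands that share the maximum value; empty if no light remains."""
    peak = max(spectrum.values())
    return [band for band in BANDS if peak > 0 and spectrum[band] == peak]


print(strongest_bands(pigment_mixture(BLUE_PIGMENT, YELLOW_PIGMENT)))
print(strongest_bands(light_mixture(BLUE_PIGMENT, YELLOW_PIGMENT)))
```

With these assumed values only the green band survives in the pigment mixture, whereas the summed lights are equal in all three bands, which corresponds to grey or white.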


COLOR MIXTURE


Maxwell’s early color mixture ex- 
periments (which date from 1852) 
were conducted with rotating disks of 
pigment colors (Maxwell, 1855a, 
1855b) and were followed by experi- 
ments with spectral colors (Maxwell, 
1860). Maxwell was responsible for 
the resuscitation of Young’s theory, 
for from the years 1855 to 1860, he 
repeatedly worked out the implica- 
tions of color mixture and color 
blindness for this theory. In 1855, he 
described theoretical response curves 
for red, green, and violet nerve sys- 
tems (Maxwell, 1855a) and in 1860, 
computed three such curves from ex- 
perimental data for normal observ- 
ers. As predicted by Herschel, his 
color-blind observer required only 
two primaries to match the colors of 
the spectrum. Consequently, color- 
blind vision was represented graphi-
cally with two response curves (Max-
well, 1860). It was only in 1860 that 
Helmholtz came out firmly in sup- 
port of Young's theory (Helmholtz, 
1867) and continued to be its cham- 
pion until his death in 1894. Max- 
well’s publications on color vision 
ceased with the year 1871. 

Maxwell (1860) was influenced by 
Brewster in the graphical representa-
tion of his "curves of primary intensi-
ties" and their relationship to the
spectral brightness or luminosity 
curve. Brewster (1834), like Mayer, 
supported a triple-spectrum theory 
and his views received much atten- 
tion (Helmholtz, 1924-25). Helm- 
holtz's (1867) diagrams of the theo- 
retical response curves which first ap- 
peared in 1860 bear a striking resem- 
blance to Brewster's diagrams of 
separate and superposed intensity 
curves of the triple spectrum. 

Maxwell popularized the use of an
equilateral triangle to describe and 
predict the data of color mixture. 
Such geometrical representations orig- 
inated with Newton (1704) who used 
a circle to illustrate the results of 
color mixture. Later, Tobias Mayer
used an equilateral triangle for the 
same purpose (Forbes, 1849). The 
chromaticity diagram of today is 
a right angle triangle that represents 
approximately the hue and saturation 
of all colors in terms of the three pri- 
maries red, green, and blue. 

Maxwell's pioneer color mixture
experiments are not precise by mod- 
ern standards. They were repeated 
by König and Dieterici (1886) and
Abney (1913) who made a more sys- 
tematic study of the problem with 
normal and color-blind observers. 
Their data were made the basis for 
the three response or excitation 
curves that were recommended as 
standard in 1922 by the Optical So-
ciety of America (Troland, 1922).
The observations of König and 
Dieterici, and Abney, however, left 
something to be desired both in the 
apparatus used and in the number of 
observers studied (Wright, 1946). 
About 25 years ago, careful rede- 
terminations were made independ- 
ently by Guild (1931) and Wright 
(1929) and were found to be in agree- 
ment. Their results provided the basis 
for new standard trichromatic mix-
ture curves which were adopted in 
1931 (Commission Internationale de 
l'Éclairage, 1932). For dichromatic 
color mixture, Pitt's (1935) data are 
considered the most precise. 


LUMINOSITY


Newton (1704) had observed:


The most luminous of the Prismatic Colours 
are the yellow and orange. These affect the 
senses more strongly than all the rest together, 
and next to these in strength are the red and 
green. The blue compared with these is a faint 
and dark Colour, and the indigo and violet are 
much darker and fainter, so that these com- 
pared with the stronger Colours are little to be 
regarded. 


About 100 years later, Fraunhofer 
(1824) attempted to measure the 
relative brightness of spectral lights. 
Until 1883, however, all luminosity
measures were relative, since methods 
to measure the absolute energy of 
spectral wave lengths did not exist. 
In 1888, Langley published luminos- 
ity curves for three observers, and he 
plotted, against wave length, the 
absolute energies that would give 
equal acuity, and presumably equal 
brightness (Langley, 1888). He 
started the modern practice of taking 
the reciprocals of these energies as 
measures of the sensitivity of the eye, 
with the result that a “bell-shaped” 
curve was obtained. 
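
The reciprocal relation between energy and sensitivity can be shown with a small worked example. The sketch below is illustrative: the function name is hypothetical and the energy values are invented for the example in arbitrary units; they are not Langley's measurements.

```python
def sensitivity_curve(energy_for_equal_effect):
    """Convert the energy required at each wave length for a fixed visual
    effect into a sensitivity measure by taking its reciprocal."""
    return {wave_length: 1.0 / energy
            for wave_length, energy in energy_for_equal_effect.items()}


# Hypothetical energies (arbitrary units), keyed by wave length in
# millimicrons. The least energy is required near the middle of the spectrum.
energies = {450: 20.0, 500: 4.0, 550: 1.0, 600: 2.0, 650: 10.0}

sensitivity = sensitivity_curve(energies)
peak = max(sensitivity, key=sensitivity.get)
print(peak, sensitivity[peak])
```

Because the wave length requiring the least energy receives the greatest sensitivity, a U-shaped energy curve of this kind becomes a curve with a single maximum when the reciprocals are plotted.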

Modern methods of obtaining
luminosity data involve either
heterochromatic brightness matches
(Gibson & Tyndall, 1923; Ives, 1912) 
or absolute threshold measurements 
(Graham & Hsia, 1954; Hecht & 
Hsia, 1947). Luminosity curves are 
not directly comparable since energy 
distributions differ materially in the 
spectra of the various sources used. 
To make such curves comparable, it 
is common practice arbitrarily to correct
them for an equal energy distribution.

Gibson and Tyndall (1923) made a
graphical compilation of all existing 
luminosity data (for daylight vision) 
where more than 10 observers pro- 
vided the measurements. They de- 
rived a luminosity curve for a total of 
200 observers as representative of 
both the equality-of-brightness and 
the flicker methods. This curve was 
adopted as a standard in 1924 and its 
validity reaffirmed in 1939 (Gibson, 
1940) by the International Commis- 
sion of Illumination. 

Studies of the dichromatic luminos- 
ity function have been reported since 
1879 (Hecht & Shlaer, 1936). It ap- 
peared that Seebeck's (1837) two 
kinds of color blindness were ac- 
counted for by these researches. 
Some of the dichromats apparently 
had a normal luminosity curve with a 
maximum around 555 mμ while
others had a marked loss in sensitivity
at the long wave length end
and a maximum around 540 mμ.
Von Kries (1897) introduced what he 
considered were nontheoretical Greek 
names for these two classes of dichro- 
mats, viz., deuteranopes and pro- 
tanopes. Actually, protanopia liter- 
ally means blindness to the first pri- 
mary and deuteranopia, blindness to 
the second, and hence these names 
are not free from a theoretical conno- 
tation. The modern tendency is to 
view the terms protanopia and deu- 
teranopia as synonymous with "first 
type" and "second type," respec- 
tively. 

The more recent experiments of 
Pitt (1935) seemed to verify the re- 
sults of the earlier studies of the di- 
chromatic luminosity function. How- 
ever, Hecht and Hsia (1947) and 
Graham and Hsia (1954) have 
pointed out that the arbitrary prac- 
tice of plotting luminosity curves 
with their maxima set at 100% 
masks any differences of shape or 
height among the curves. They determined
threshold energies throughout
the spectrum for normal and dichromatic
observers and did not
adopt the misleading practice of set- 
ting the maxima at 100%. When the 
reciprocals of absolute or relative 
energies are presented this way, 
protanopes show a loss of sensitivity 
for the red and yellow wave lengths 
and deuteranopes a loss for the green 
and blue wave lengths, as compared 
to the normal trichromat. An in- 
creasing number of investigators re- 
port that deuteranopes show a defi- 
nite loss of sensitivity at the short 
wave length end of the spectrum and 
that their luminosity function is un- 
like that of normal observers (Gra- 
ham & Hsia, 1960). 
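
The masking effect of setting every maximum at 100% can be shown with a short sketch. The two curves below are hypothetical: the second is simply the first reduced to one quarter at every wave length, an assumption made only to show what the normalization hides.

```python
import math


def normalize_to_100(curve):
    """Rescale a sensitivity curve so that its maximum equals 100."""
    peak = max(curve.values())
    return {wave_length: 100.0 * value / peak for wave_length, value in curve.items()}


# Hypothetical relative sensitivities (reciprocals of threshold energy).
normal = {450: 0.05, 500: 0.30, 555: 1.00, 600: 0.60, 650: 0.10}

# Hypothetical observer with the same curve shape but one quarter the sensitivity.
reduced = {wave_length: 0.25 * value for wave_length, value in normal.items()}

normal_pct = normalize_to_100(normal)
reduced_pct = normalize_to_100(reduced)

# After normalization the two observers can no longer be told apart.
curves_look_identical = all(
    math.isclose(normal_pct[w], reduced_pct[w]) for w in normal
)
print(curves_look_identical)
```

A fourfold difference in height disappears entirely once both maxima are set at 100, which is the objection raised against the practice in the paragraph above.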


WAVE LENGTH DISCRIMINATION 


Color vision researchers have also 
used wave length discrimination data 
as a basis for theory. In 1867 Man- 
delstamm reported such data for the 
normal eye (1867). Several investi- 
gations of this problem have since ap- 
peared in the literature (Ladekarl, 
1934; Laurens & Hamilton, 1923; 
Wright & Pitt, 1934). The experi- 
mental method most frequently used 
involves a comparison of two con- 
tiguous monochromatic lights of the 
same wave length, first equated for 
color (hue and saturation) as well as 
brightness. The wave length of one 
of the lights is then varied until the 
observer reports a just-perceptible 
color difference. The brightness of 
the wave length being compared is al- 
ways equated before the decision 
about color is made. This procedure 
is adopted to control the Bezold-
Brücke phenomenon (Purdy, 1930,
1937), viz., the fact that spectral 
lights vary in color with changes in 
luminance levels. The difference 
threshold is generally expressed in 
terms of the just perceptible change 
in wave length (Δλ). Some studies express
the mean error of the difference
thresholds as a function of the standard
spectral wave lengths (König &
Dieterici, 1886; Ladekarl, 1934; Lie- 
berman & Marx, 1911). 

When the method of just-percepti- 
ble difference is adopted, the results 
take the form of a curve which ap- 
proaches the base line at parts of the 
spectrum where discrimination is 
good and recedes from it where dis- 
crimination is comparatively poor. 
In other words, the minima in the 
curve correspond to relatively higher 
sensitivities than the maxima. For 
normal observers, wave length dis- 
crimination is largely determined by 
hue differences. Steindler (1906) 
found four regions of minimal thresh- 
olds in the violet, blue-green, yellow, 
and red, the curve taking the shape 
of four successive troughs. She veri- 
fied the existence of the minimum at 
the red end of the spectrum by 
matching the wave lengths for brightness
with a Nicol prism and using
red filters to guard against stray light 
(p. 50). Jones (1917) and Laurens 
and Hamilton (1923) obtained approximately
the same empirical
curve. The results of Wright and Pitt 
(1934) were in fair agreement with 
the three researches just mentioned 
except that the minimum in the red 
(around 630 mμ) is missing. Other
modern researches report maximum 
sensitivity for wave length discrimi- 
nation at only the blue-green and 
yellow portions of the spectrum (Cor- 
bett, 1937; Ladekarl, 1934; Malling, 
1919; Roaf, 1927). In view of these 
varying results it is not surprising 
that a standard wave length dis- 
crimination curve is lacking. 
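
Since the minima of the curve mark the regions of best discrimination, locating them is a matter of finding the troughs. The following is a minimal sketch under stated assumptions: the thresholds are invented values chosen to give four troughs of the kind Steindler described, and `find_troughs` is a hypothetical helper, not a procedure taken from any of the studies cited.

```python
def find_troughs(wave_lengths, thresholds):
    """Return the wave lengths at which the threshold is lower than at both neighbors."""
    troughs = []
    for i in range(1, len(thresholds) - 1):
        if thresholds[i] < thresholds[i - 1] and thresholds[i] < thresholds[i + 1]:
            troughs.append(wave_lengths[i])
    return troughs


# Hypothetical just-perceptible differences in wave length, in millimicrons.
wave_lengths = [420, 440, 460, 490, 530, 580, 610, 630, 650]
delta_lambda = [3.0, 2.0, 3.5, 1.0, 2.5, 1.2, 3.0, 2.2, 4.0]

troughs = find_troughs(wave_lengths, delta_lambda)
print(troughs)
```

With these invented values the troughs fall in the violet, blue-green, yellow, and red. A curve lacking the red trough, as in the results of Wright and Pitt, would simply return three wave lengths instead of four.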

Donders (1884) and König and
Dieterici (1886) were among the 
earliest to report the wave length dis- 
crimination of dichromats. They 
were followed by several others 
(Hecht & Shlaer, 1936; Ladekarl,
1934). The empirical results have
been remarkably similar. Both pro- 
tanopes and deuteranopes show the 
same reduced capacity for wave 
length discrimination, their thresh- 
old discriminations taking the form 
of a U-shaped curve, with a single
minimum in the blue-green wave
lengths around 500 mμ. On either
side of this minimum, discrimination
begins to deteriorate, being extremely
poor in the violet and green portions
of the spectrum. No color differences
can be detected in the long wave
lengths beyond about 530 mμ.³


SATURATION DISCRIMINATION 


In everyday speech the saturation 
of a color is referred to with adjectives 
such as "pale," "weak" or "light,"
"strong," "dark" or "deep." Even
after a cursory examination of the 
spectrum, persons with normal color 
vision report that the spectral colors 
appear unequally saturated, the col- 
ors at the extremes being more 
saturated than the colors in the mid- 
dle. Early experimentation with sat- 
uration depended entirely on methods 
involving rotating discs of pigment 
colors (Parsons, 1924). To this day, 
we lack precise experimental data on 
the relative saturation of spectral 
wave lengths, first, because the prob- 
lem of quantifying saturation judg- 
ments has not been satisfactorily re- 
solved and second, because a suitable 
experimental method of obtaining 
these measurements has not yet been 
demonstrated. 

3 Balaraman, Hsia, and Graham (1962)
found unreliable (though recordable) wave
length discrimination thresholds for two
deuteranopes and three protanopes in the
long wave lengths. Further improved research
with similar observers might indicate whether
some dichromats see more saturation differences
at the long wave length end than
others, or whether these observers are partial
dichromats.

The attention of experimenters has
hitherto been focused largely on
stimulus variables that control spec- 
tral saturation. One such variable is 
luminance level (Purdy, 1931). As 
you increase the luminance of a dim 
monochromatic light, the saturation 
increases until a level is reached where 
saturation is at a maximum. Further 
increases result in the diminution of 
saturation until at certain high lumi- 
nance levels, spectral wave lengths 
lose all color and appear white to the 
observer. 

Another saturation variable is dem- 
onstrated as follows: Starting with 
two equally bright white lights A and 
B, the experimenter adds varying 
amounts of a given monochromatic 
light to A. When the observer re- 
ports that A is just-perceptibly 
colored, the experimenter begins to 
add varying amounts of the same 
monochromatic light to B. The ob- 
server now indicates when B is just- 
perceptibly more saturated than A. 
By this step-by-step method of com- 
parison, so-called "saturation steps" 
are measured. When the number of 
saturation steps is plotted against 
wave length, a V-shaped curve is obtained
with a minimum at about 570
mμ (Martin, Warburton, & Morgan,
1933). Such saturation steps refer to 
a stimulus consisting of spectral wave 
lengths and nonspectral white, thus 
confusing the issue of spectral satura- 
tion. 

In the above experiment, amounts 
of the monochromatic light are added 
in such a way that the total luminance
of the mixture field is maintained
constant. The procedure requires
that when the luminance Lλ of
the monochromatic light is increased,
the luminance of the white light Lw
must be decreased by the same
change in luminance. It has become
common practice to express the first
saturation step from white in terms of
the ratio of luminances Lλ to Lw.
This ratio is called the least or minimum
colorimetric purity, or p. The
reciprocal of p is generally used for 
the reason that it conveniently repre- 
sents a poor discrimination by a low 
value and a better discrimination by 
a high value. When the reciprocal of 
p is plotted against wave length, a V-shaped
curve is obtained, with a
minimum in the 575 mμ region (Hartridge,
1950; Jameson & Hurvich,
1955). Since this V-shaped function
roughly correlates with the subject's 
verbal report about the relative satu- 
ration of spectral wave lengths, it has 
become customary to speak of colori- 
metric purity as a derived measure of 
saturation discrimination. Actually, 
p cannot be a measure of the relative 
saturation of spectral wave lengths
since by definition p has a value of 1.0 
for any spectral light that is added to 
white light (Graham, 1959). 
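
The use of the reciprocal of p can be illustrated as follows. The values of least colorimetric purity below are hypothetical and serve only to show why the plotted function has its minimum where the largest admixture of spectral light is needed.

```python
# Hypothetical least colorimetric purities at several wave lengths (in millimicrons).
# A larger p means more spectral light was needed before color was just perceptible.
least_purity = {450: 0.002, 500: 0.010, 575: 0.050, 620: 0.008, 680: 0.003}

# The reciprocal turns poor discrimination (large p) into a low plotted value.
reciprocal_purity = {
    wave_length: 1.0 / p for wave_length, p in least_purity.items()
}

minimum_at = min(reciprocal_purity, key=reciprocal_purity.get)

for wave_length in sorted(reciprocal_purity):
    print(wave_length, round(reciprocal_purity[wave_length], 1))
print("minimum of 1/p at", minimum_at)
```

With these invented values 1/p is lowest at 575 mμ and rises toward both ends of the spectrum, reproducing the V shape the text describes.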

For the dichromats, the reciprocal 
of p increases rapidly from the neu- 
tral point and towards the short wave 
lengths. The function rises rapidly at 
first from the neutral point and 
towards the longer wave lengths up 
to 530 mμ where it gradually levels
off. Minimum p values occur near
500 mμ for deuteranopes and 490 mμ
for protanopes (Chapanis, 1944;
Hecht & Shlaer, 1936). The colorimetric
purity function for both types
of dichromat is indeterminate at the
neutral point. This is an expected
phenomenon since dichromats see a
narrow band of wave lengths near
500 mμ as white or neutral in color.
An infinite amount of such wave 
lengths can be added to a white light 
without producing any color change 
for these observers. 


THE ANOMALOUS TRICHROMATS


In 1881, while performing color 
mixture experiments with subjects 
thought to be normal, the third Lord
Rayleigh (1882) found wide variations
in the ratio of red to green lights
mixed to match a yellow light. 
Though all 23 of his subjects required 
one particular mixture of red and 
green to match the yellow, five of 
them required more green and two of 
them more red than the others. Their 
match was not acceptable to the 
normal eye; it appeared too greenish 
or reddish. Rayleigh's dichromats 
could match a full red or green with 
the yellow by merely adjusting the 
luminance of the yellow. Later 
(Rayleigh, 1890) he found an observ- 
er who could match the green and 
not the red with the yellow. It was of 
this observer that he remarked, “It
looked as though the third color sen- 
sation presumably red, was defective, 
but not absolutely missing.” His- 
torically, Rayleigh’s first two diver- 
gent groups came to be known as 
anomalous trichromats and in line 
with trichromatic theory it was believed
that such persons were either
"green-weak" (deuteranomalous) or
"red-weak" (protanomalous).

We know today that the frequency
distribution of the red/green ratios
required by a large sample of normal
observers takes the form of a normal
probability curve with certain cases
falling outside of the limits of this
curve. There is no general agreement
as to what red/green ratios should be
called anomalous. Some workers
apply the term to extreme variants
within the normal curve (Edridge-
Green, 1913; Nelson, 1938) and
others reserve it for cases that lie outside
of the normal curve limits (Hailwood
& Roaf, 1937; Nelson, 1938;
Schmidt, 1955; Schuster, 1890).
Anomalous observers also differ in 
their range of matches for the yellow 
(Jameson & Hurvich, 1956). 

Anomalous observers generally re- 
quire three lights or primaries to 
duplicate spectral colors but in proportions
which differ from those of
the normal trichromats (McKeon &
Wright, 1940; Nelson, 1938). Others 
who find two primaries sufficient to 
match certain ranges of the spectrum
are frequently called "the extreme
anomalous" or partial dichromats.
Such persons have a neutral point in 
the blue-green wave lengths (Abney, 
1895; Hayes, 1911; Koffka, 1909; 
Rosenkrantz, 1926). 

McKeon and Wright (1940) re- 
ported a marked loss in luminosity at 
the red wave lengths for their pro- 
tanomalous observers, similar to the 
luminosity losses of the protanopes. 
It is not clear whether such a marked
loss for the protanomalous is an em- 
pirical fact or an artifact arising from 
the convention of naming maximum 
sensitivity 100%. Wright (1946) is 
inclined to believe that these cases 
represent extreme protanomalous ob- 
servers and that his sample was not 
representative of various degrees of 
the defect. 

The results of Pitt (1935, Appen- 
dix 1) and Nelson (1938) are not in 
agreement concerning the deutera- 
nomalous luminosity function. Pitt 
found luminosity losses in the blue 
and the green wave lengths but 
luminosity increases at the yellow and 
orange wave lengths, as compared to 
the normal luminosity curve. Nelson 
believes his results indicate almost a 
normal luminosity function for the 
deuteranomalous. 

The wave length discrimination of 
anomalous observers is described by 
a two-minima curve (Engelking, 
1926; McKeon & Wright, 1940; Nel- 
son, 1938). For both protanomalous 
and deuteranomalous observers, 
maximum sensitivity is recorded in 
the blue-green wave lengths around 
500 mμ. The second minimum in the
longer wave lengths is higher than
the first, indicating poorer discrimination.
It appears in the yellow (580
mμ) for the protanomalous and in the
orange (620 mμ) for the deuteranomalous.

The colorimetric purity data for
the anomalous also differ from those
of the normal (Chapanis, 1944; Mc- 
Keon & Wright, 1940; Nelson, 1938). 
In general, the values of 1/p tend to 
be smaller throughout the spectrum, 
again indicating poorer discrimina- 
tion. Nelson (1938) and McKeon and 
Wright (1940) found much smaller 
variations in the curve and no marked 
minimum as compared to the normal 
observer. According to Chapanis 
(1944) protanomalous observers re- 
semble protanopes in this function 
except for a secondary dip in the 560
mμ region.

THE CLASSIFICATION OF
COLOR VISION TYPES


Modern classifications of color 
vision types rely heavily upon the 
data of color mixture since there is 
either lack of agreement or insuffi- 
cient research concerning other basic 
visual functions. It has become ac- 
cepted practice to define color vision 
types according to the number of pri- 
maries necessary to reproduce spec- 
tral colors. The normal trichromat 
requires three primaries to do this 
while the dichromat requires two pri- 
maries. Dichromats are today sub- 
divided into three classes: prota- 
nopes, deuteranopes, and tritanopes. 
Modern research with tritanopes has 
not advanced as far as it has for the 
other two classes of dichromats 
(Wright, 1952). Anomalous trichro- 
mats generally require three pri- 
maries to duplicate spectral colors 
but in proportions which differ from 
those of the normal trichromats (Mc- 
Keon & Wright, 1940; Nelson, 1938). 
They are subdivided into three 
classes: protanomalous, deuteranom- 
alous, and tritanomalous. Some 
rare individuals can reproduce all
spectral colors with a single spectral
light or primary. They are called
monochromats and on the basis of
their luminosity data are subdivided
into two or more classes (Judd, 1 
Pitt, 1944).
It has been estimated that less than
5% of females have defective color
vision. The incidence among males
is approximately 8%, of which
6% are anomalous trichromats and
2% dichromats (Wright, 1946).


THE TRICHROMATIC THEORY
Normal Color Vision 


As we have already noted, the essential
aspect of this theory is the
concept that there are three sets of
cone mechanisms in the fovea, each
with a given spectral sensitivity. The
red receptors are said to be maximally
sensitive to the red wave lengths, the
green receptors to the green wave
lengths, and the blue receptors to the
blue wave lengths. The complete absence
of histological evidence for
three types of cones is not held
against the theory. It is argued that
the triple subdivision may occur in
the cones on a submicroscopic
scale. It is hoped that work such as
Granit's (1947) electrophysiological
research and Rushton's (1957, …)
research on the foveal pigments of
normal and color-blind eyes may establish
such a hypothesis on an objective
basis.

Fundamental response curves are derived
from spectral mixture curves. The latter represent
brightnesses of the primaries needed
to match the brightness of a given spectral
wave length (from a light source of …
wave length distribution).

The fundamental response curves
are said to be characteristic of the
three hypothetical, theoretical receptor
systems. The trichromatic theory
postulates that spectral hue is dependent
on the ratio, and spectral
brightness (or luminosity) on the
sum, of the ordinates at any given 
wave length. "White" is experienced
when all three receptor mechanisms
are stimulated in appropriate ratios
(usually equal), and this latter
condition is assumed when all three
ordinates are equal. The saturation
of perceived color is influenced by the
ratio of the "white-producing" ordinates
to the remaining ordinate or
ordinates. When the red and green
receptors are equally stimulated
(and this is assumed when the red and
green ordinates are equal), a "yellow"
experience is initiated in the
brain. 

The three response curves are con- 
sidered to be independent of lumi- 
nance. At first it was believed that 
hue and saturation were invariant 
with changes in luminance. When it 
was established that this expectation 
was contrary to the facts, a new hy- 
pothesis was brought forward by 
Helmholtz, viz., that the receptor 
responses do not increase in propor- 
tion to their stimuli but obey a law of 
diminishing returns (Purdy, 1931). 
For a given increase in spectral lu- 
minance, the weakest of the three re- 
ceptor systems will increase relatively 
more than the two stronger systems, 
thus increasing the amount of white 
being experienced. Spectral wave 
lengths therefore become more de- 
saturated with increases in lumi- 
nance. 
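
The consequence of the diminishing-returns hypothesis can be made concrete with a small sketch. Two assumptions are made here that are not in the text: the response law is taken to be logarithmic, and the three receptor systems are given arbitrary relative stimulations of 10, 5, and 1. Any other compressive function would serve equally well for the illustration.

```python
import math


def response(stimulus):
    """Assumed compressive response law; the logarithm is an illustrative choice only."""
    return math.log(1.0 + stimulus)


def weakest_to_strongest(luminance, weights=(10.0, 5.0, 1.0)):
    """Ratio of the weakest to the strongest receptor response at a given luminance.

    The weights are hypothetical relative stimulations of the three systems
    by one spectral light.
    """
    responses = [response(weight * luminance) for weight in weights]
    return min(responses) / max(responses)


ratio_dim = weakest_to_strongest(1.0)
ratio_bright = weakest_to_strongest(10.0)

print(round(ratio_dim, 3), round(ratio_bright, 3))
```

The weakest system gains on the strongest as luminance rises, so the three responses draw closer to equality and, on the theory, the white component grows. Under this assumption saturation can only fall as luminance increases from threshold.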


Color-Blind Vision 


It was first assumed from general 
considerations of heredity and evolution
that normal and color-blind
vision are related. The empirical
study of the latter was therefore of
great significance for theories of 
normal color vision. It was believed 
that dichromasy was caused by the 
total loss of a fundamental system
(Helmholtz, 1867; König & Dieterici,
1886; Maxwell, 1855b). Such a hypothesis
seemed to explain the color
mixture, luminosity, and wave length 
discrimination data of dichromats. 
White was experienced when the two 
mechanisms were equally stimulated, 
a hypothesis that could also be used 
to explain the neutral point. When 
monocular color-blind persons (with 
deuteranopic or protanopic vision in 
one eye) claimed they saw a blue and 
a yellow in a spectrum with the de- 
fective eye (Judd, 1948), the sup- 
porters of a reduction system were at 
a loss. The system they hypothesized
could not explain the perception of 
yellow in dichromats. Some of them 
turned to a "fusion theory" to ac- 
count for the reported color experi- 
ences of the dichromats. 

The original statement of the 
fusion theory is attributed to Helm- 
holtz by Pole (1893). If Helmholtz 
intended to describe such a concept 
in 1867, he did not make himself very 
clear (Helmholtz, 1867, p. 848). The 
first straightforward description of it 
was given by John Aitken in 1872. 
Discussing possible changes in the 
shape and number of excitation 
curves that would account for color 
blindness, Aitken (1872) suggested 
that in some cases, "the nerves might
be so constructed that the red nerves 
might be sensitive to all the rays to 
which the green nerves are sensitive," 
so that both nerves being excited at 
the same time, "the sensation pro- 
duced would be what we call yellow.” 
Leber (1873) and Fick (1874) put 
forward the same viewpoint and sug- 
gested it as an explanation for both 
dichromatism and the extrafoveal 
color vision of the normal trichromat. 
Fick (1879, 1890) became a strong 
supporter of this theory to such an 
extent that it often bears his name. 

The fusion concept postulates that
the perception of yellow in color-
blind vision originates in the central
nervous system. While it can be 
made to account for color mixture, 
wave length discrimination, and neu- 
tral point of the dichromats, it seems 
to be irreconcilable with their lumin- 
osity functions. Helmholtz in 1885 
believed he had found an application 
of the fusion concept that would ex- 
plain the dichromatic luminosity 
function (1896, p. 369). He pointed 
out that if the green curve shifted to 
the red curve, the colors at the red 
end of the spectrum will appear com- 
paratively bright and the green wave 
lengths less bright. This condition 
would describe deuteranopia. If the 
red curve shifted over the green 
curve, there will be reduced sensi- 
tivity for red wave lengths and the 
green wave lengths around 540 mμ
will appear comparatively bright. 
This would describe protanopia. 
However, it should be recognized 
that if dichromasy represents fusion 
and not loss, the luminosity at the 
protanopic maximum (540 mμ)
should be higher than normal at this
point and higher along the shorter
wave lengths. By the same logic, the
maximum luminosity around 570 mμ
and luminosity for the longer wave 
lengths should be higher for the 
deuteranope as compared to normal 
luminosity. Unfortunately, the di- 
chromatic luminosity curves do not 
fulfill these expectations. 

The existing contradictions be- 
tween trichromatic theories ("fusion" 
and "reduction") and the dichromat- 
ic luminosity function have led some 
theorists to take the stand that 
whereas deuteranopia represents a 
fusion of red and green excitations, 
protanopia can be explained only by 
a simple reduction system (Pitt, 
1945). Such a stand does not resolve 
the contradictions described above.
Other theorists feel that since the
fusion theory explains more facts of
color blindness than the reduction
theory, it can be modified to explain
dichromatic luminosity losses.
Graham and Hsia's (1958) double- 
shift concept is an example of such 
a formulation. Others postulate 
a separate white mechanism which is 
assumed to be either entirely or 
mainly responsible for brightness 
(Hunt, 1952; Piéron, 1952). 


DISCUSSION AND CONCLUSIONS


Normal color vision is trichromatic. 
This statement is sometimes taken 
incorrectly to mean that a spectral 
monochromatic band of wave lengths 
can be matched by a mixture of three 
primaries. What is in fact the case is 
that the monochromatic band may 
be mixed with one of the primaries to 
give a two-color mixture that can 
match the mixture of the other two 
primaries. In the algebraic represen- 
tation of the situation the primary 
that is mixed with the test mono- 
chromatic band is given a negative 
sign (as, for example, in the equation
cC = rR + gG − bB, which means that c
units of test color C mixed with b
units of primary B (blue) match r
units of primary R (red) plus g units
of primary G (green)). The fact that
for color mixtures the amounts of the
colors sum additively is called Grassmann's
law. As applied to luminances
the law is called Abney's law, and is
probably nearly correct for appropri- 
ate conditions of measurement. 
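
The meaning of the negative sign can be worked through numerically. In the sketch below each light is described by three numbers, and the amounts r, g, and b are found by solving the linear equation C = rR + gG + bB. The three-number descriptions of R, G, B, and C are hypothetical, chosen only so that the blue amount comes out negative, and the additivity assumed is that of Grassmann's law.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def match_amounts(primaries, target):
    """Solve target = r*R + g*G + b*B for (r, g, b) by Cramer's rule."""
    # Each primary forms one column of the system matrix.
    matrix = [[primaries[j][i] for j in range(3)] for i in range(3)]
    denominator = det3(matrix)
    amounts = []
    for k in range(3):
        replaced = [
            [target[i] if j == k else primaries[j][i] for j in range(3)]
            for i in range(3)
        ]
        amounts.append(det3(replaced) / denominator)
    return tuple(amounts)


# Hypothetical three-number descriptions of the primaries and of the test color.
R = (1.0, 0.2, 0.0)
G = (0.5, 1.0, 0.1)
B = (0.0, 0.1, 1.0)
C = (0.5, 0.9, 0.0)

r, g, b = match_amounts((R, G, B), C)

# A negative b means the blue primary must be added to the test color instead.
test_side = tuple(C[i] + (-b) * B[i] for i in range(3))
mixture_side = tuple(r * R[i] + g * G[i] for i in range(3))

print(round(r, 4), round(g, 4), round(b, 4))
```

Here one unit of C is used, so c = 1. The solution gives positive r and g and a negative b, and the two sides of the match agree once the blue primary is moved to the side of the test color, which is the physical arrangement the text describes.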

The data of color mixture, i.e., the 
combinations of quantities of pri- 
maries required to match mono- 
chromatic spectral colors, are given 
by the spectral mixture curves as well
as by the chromaticity coordinates
for which intensity of light is treated
as constant. 

It has been suggested that the
negative values applying to the
"desaturating" primary may represent
inhibitory neural effects. In
other cases it has been thought that
the negative values are measures of
desaturation produced by overlapping
of fundamental curves representing
the spectral absorptions of
the basic receptors. The possibility
that the negative values may be due
to the presence of more than three
receptors or processes is usually ignored.

The trichromatic theory implies
that one set of a possible infinity of
sets of primaries should describe the
data of color mixture. We know today
that three primaries, properly
chosen, can do this and that the facts
of color mixture can be satisfied by a
… (Hurvich & Jameson, 1955), or
… (Hartridge, 1950). It should be
recognized that color mixture alone
cannot provide the foundation stone
of the theory.

It is occasionally admitted by supporters
of the trichromatic theory
that the differences in spectral saturation
are difficult to explain with the
theoretical argument concerning
the "white-producing" ordinates
(Wright, 1946). Most of the published
fundamental response curves
show only two receptor curves from
about 530 mμ to 700 mμ (Wright,
1946). Thus the conditions necessary
for white are missing here and the fact
that yellow is the least saturated of
spectral colors is unexplained.

The diminishing-returns hypothesis
extended by Helmholtz cannot explain
the Bezold-Brücke phenomenon,
i.e., the fact that changes
in hue occur with changes in luminance,
nor does it lead to a satisfactory
explanation of changes in spectral
saturation with changes in luminance.
According to the hypothesis,
saturation should progressively decrease
with increases in luminance
and hues should be maximally satu- 
rated at threshold. As noted pre- 
viously, the experimental findings are 
that as luminance increases from 
absolute threshold to a high value, 
the saturation first increases to a 
maximum and then decreases (Purdy, 
1931). 

It is obvious that the assumption 
of diminishing returns cannot be 
reconciled with the assumption of ad- 
ditivity of luminosities (Abney's law). 
The latter hypothesis, extended to 
account for spectral brightness differ- 
ences, necessitates a fixed ratio of the 
three response ordinates for all lumi- 
nance levels. 

The experimental data from anom- 
alous trichromats are meagre, con- 
troversial, and offer difficulties to the 
theorists. Some workers attribute 
anomalous trichromasy to varying 
degrees of defect in either the red or 
green fundamental system (Pitt, 
1949; Wright, 1946) and others fall 
back on the shift-theory or its modifi- 
cations to explain it (Abney & Wat- 
son, 1913; Pitt, 1935, Appendix II). 
Nelson (1938) and Pitt (1949) suggest 
that deuteranomaly may sometimes 
be due to a red response curve that 
exceeds the height of the normal red 
curve. 

After more than a century of scien- 
tific research in color vision the tri- 
chromatic theory continues to face 
theoretical contradictions and unex- 
plained facts. Trichromatic theorists 
everywhere should rigorously ex- 
amine the theory's basic assumptions, 
provide much more experimental 
data on the basic visual functions, 
and honestly ask themselves the ques- 
tion: should the theory be subject to 
drastic revision or should it be replaced
by some other theory?

The current period in psychology is
marked by the vigor with which 
theory and investigation are coming 
to grips with dynamic aspects of be- 
havior. To a very significant degree, 
conditions which have been visible 
mostly in the clinic or the case study 
are becoming conspicuous variables 
in experimental research. 

Presently, too, there is a rapidly 
mounting renewal of interest in 
human thinking, mostly in the areas 
of problem solving and concept for- 
mation—but with indications that 
imaginative processes are also becom- 
ing respectable again in the labora- 
tory (with an empirical orientation, 
that is, rather than from the practical 
standpoint of projective techniques). 
The great attention to creative think- 
ing in recent years is another impor- 
tant indication of this trend. 

It is often the case that streams of
investigation well up alongside each 
other, with their separate preoccupa- 
tions and their own technical elabora- 
tions, probably springing from differ- 
ent sources (those fascinating inno- 
vating springs in science). They may 
long continue to run happily in dif- 
ferent channels with little direct 
interfusion. So, to a considerable 
extent, it appears to have been with 
those psychological specialities called
"motivation" and "thinking." We
can all cite well-known names in each,
but without, as a rule, placing a
person in both at once. Yet thoughtful
psychologists are fully aware of
how artificial these distinctions really
are.


I have come increasingly to realize 
that there has been progressing for 
some time a courtship between 
motivation and thinking. Although 
the suitors may be coy, at times, or 
even scornful of each other, never- 
theless on the whole they have been 
approaching their wedding day with 
surprisingly clear intimations of har- 
mony. I refer, here, to academic ex- 
perimental psychology, not to those 
traditions (like psychoanalysis) where 
no division has ever really existed 
between dynamic and cognitive as- 
pects of behavior. 

This symposium brings together a 
diverse set of papers. The reasons for 
this diversity are easy to see. If you 
seek rapprochement between the
fields known as motivation and
thinking, you will soon discover that
a wide variety of psychological work, 
theoretical as well as experimental, 
belongs together. The four papers, 
therefore, symbolize this mutuality of 
interest just as much as they repre- 
sent widely different approaches. 
They reflect the sort of effort that 
must be made if a happy marriage 
between motivation and thinking is 
to be consummated. 

We shall, therefore, let the papers 
speak for themselves.

The general problem with which
this symposium is concerned is cer- 
tainly very basic in psychology—and 
it is one that I personally have found 
increasingly important as the years
go past. Indeed, in my opinion it
lies at the very heart of human be- 
havior. This problem may be stated 
as the degree to which cognitive 
performance varies in content, quan- 
tity, and quality with variations in 
the motivational state of the indi- 
vidual. By “motivational” I mean 
“the... processes that instigate, 
regulate, and adjust behavior" 
(Vinacke, 1960b). One may prefer to 
conceive of such conditions as inferred 
intrinsic properties of the person—or 
he may cautiously speak only about 
the manipulations of the experi- 
menter, such as his instructions or 
the kinds of pressures (often called 
"stress") that he imposes upon his 
subjects; for our purposes, either
has to do with motivation.

Of course, psychologists by no
means agree as yet, if they ever will, 
on answers to the difficult theoretical 
issues that arise when one invokes 
motivational determinants of be- 
havior. These issues concern the role 
played by driving as compared to 
steering aspects of the motivation 


1 This paper and the following three papers 
by Maltzman, Johnson, and Hilgard were 
presented in a symposium at the Western
Psychological Association meetings in Seattle,
Washington, June 15, 1961, with W. Edgar 
Vinacke acting as chairman. 

I am much indebted to David Crowell, 
whose thoughtful advice facilitated the writ- 
ing of this paper. Work carried out under a 
fellowship from the John Simon Guggenheim 
Memorial Foundation provided essential 
background. 
system, the proper way to explain
"primary" and "secondary" components
of motivation, and so on.
Recent books by Bindra (1959) and 
Brown (1961) have sought to clarify 
these issues, and I, too, have written 
about them (Vinacke, 1960a). As is
well revealed by Hall and Lindzey
(1957), personality theory has searched
in the past and still continues to search
for a satisfactory account of motivation.
Although it is extremely tempting
to employ this forum as a means
to present that definitive treatment
of motivation for which you have all
been waiting, I must disappoint you,
since, at the moment, I wish to speak
about thinking rather than motivation.
Whatever brand of motivation
theory you prefer, I shall
insist that thinking is at all times
“the utilization of past experience in
response to motivational states”
(Vinacke, 1960b).

In general, therefore, my thesis is
that thinking is inseparable from 
motivation. We cannot possibly ad- 
vance our understanding of thinking 
without continued and intensive 
study of the motivational conditions 
which change its course and determine
its character. In expounding
this theme, I shall first look briefly at 
its emergence in psychology, 
then indicate how data bearing upon 
it are accumulating at a rapid rate 
with the result that quite exciting 
vistas open for the future. 


HISTORICAL ANTECEDENTS 


There is, as you all know, nothing 
terribly new in the statements I
have just made. We have only to
recall the classical discussions of
attention (of which Wundt's doctrine
of apperception is a striking example)
to see that interest
in dynamic aspects of thinking has 
had a history parallel to that of 
scientific psychology itself. Psycho- 
analysis from the very beginning has 
seen thinking as depending upon
motivational states, stressing un- 
conscious determinants, as well as 
ego regulating processes. In Lewin's 
work, the dynamic emphasis was always
strongly apparent, represented
by concepts of tension and its resolu- 
tion, of valence and vector. The in- 
fluence exerted by Murray's studies 
of personality depends directly upon 
his having closely tied together cogni- 
tion and motivation. Perhaps no one 
has more clearly and forcefully
recognized the interrelation of motivation
and thinking than Gardner
Murphy. Indeed, his formulation of
thinking as a continuum between autistic
and realistic poles can scarcely
be improved upon: thinking is a func- 
tion of the onward-course of need- 
conditions which, however, are re- 
sponsive to the character of the envi- 
ronmental situation. 

I could go on reminding you of 
dynamic approaches to thinking, but 
perhaps this is enough to make the 
point clear that the motivated char- 
acter of thinking is far from a new 
idea in psychology. Nevertheless, 
whereas clinical psychology is built
directly upon motivational concepts, 
experimental psychology has only 
recently come to depend much upon 
them. Let us glance at these develop- 
ments. 


EXPERIMENTAL APPROACHES


Perhaps it is not an exaggeration to
say that the past decade has witnessed
a revolution in the psychological 
laboratory. One might say that not 
long ago the objective of an experiment
was to expose




at least with human subjects. 
What a difference can be seen 
It has become almost as 
have been devised to manipulate
motivation: namely, deprivation, the 
assessment by pretest of latent 
motivational properties, and the use 
of induction to influence the subject 
in the situation itself. 

Deprivation signifies the preven- 
tion of some activity, which may be 
considered consummatory, for some 
specified period of time. Although 
this technique is widely employed 
with animal subjects, it is still in its 
infancy with human subjects. The 
pertinent research has so far chiefly 
been concerned with hunger and with 
the withholding of sensory or social 
stimulation. In general, as depriva- 
tion continues, imaginative activity 
relevant to the inferred need state 
increases (Sanford, 1936) at least up 
to an optimum point, after which 
there is a decline (Wispé, 1954). This 
probable curvilinear relation (the 
“optimum principle"), formulated in 
1908 by Yerkes and Dodson, is 
potentially of vast importance. Un- 
fortunately, most experiments have 
not been designed to reveal it, since 
it is usual to constitute groups chosen 
from the extremes of a continuum. 
With respect to problem solving, how- 
ever, it appeared in an experiment by 
Birch (1945). Chimps produced bet- 
ter solutions to stick problems instru- 
mental to food under conditions of 
moderate deprivation. 

Studies with sensory and social 
deprivation are so far mainly in the 
exploratory stages, but it is quite 
clear that there are important effects 
upon performance (Bexton, Heron, & 
Scott, 1954; Gewirtz & Baer, 1958) 
and an interesting avenue opens out 
here for future research. 

More extensively developed in re- 
search on human subjects is the 
administration of pretests, by means 
of which to identify persons who differ
in some respect, such as "achievement"
or "affiliation," or manifest or
test anxiety, or perhaps some attitudinal
variable like the "cognitive
styles" studied by George Klein and
his associates. It is by now familiar
to everyone that thematic appercep- 
tion pictures yield scores that may 
fruitfully be regarded as reflective of 
significant motivational patterns, a
notion, of course, basic to projective
testing. Much of the work initiated
by McClelland (McClelland, Atkinson,
Clark, & Lowell, 1953) and
pursued actively by Atkinson, 
French, and many others, has been 
methodological in character. But it 
has been established that imaginative 
response varies with latent disposi- 
tions, as well as with experimental 
inductions. Beyond this, differ- 
ences in performance on a variety of 
tasks are associated with differences 
in motives, as inferred from pretests. 
For example, persons high in affilia- 
tion are more efficient in tasks that 
require sensitivity to social cues than 
are persons low in affiliation (Atkin- 
son & Walker, 1956; French & 
Chadwick, 1956). 

The situation is more complicated 
with respect to anxiety measures, 
chiefly because it is probable that the 
effects of anxiety depend upon a
considerable variety of additional
conditions, like the complexity and
difficulty of the task, and the kind of
inductions employed by the experimenter.
Nevertheless, a wide variety
of effects associated with anxiety has 
been reported (Sarason, 1960). 

Some of the most promising re- 
search has been conducted by Klein 
(1954) with attitudes that he calls 
"cognitive style." This is defined as 
a regulative process that determines 
attention to the goal and how a person
copes with the task situation. In
one experiment, constricted and flex- 
ible subjects were compared under 
thirsty and sated conditions. In a 
judgment task, flexible subjects overestimated
size, regardless of degree of
thirst, whereas constricted subjects 
underestimated. On the other hand, 
in a tachistoscopic task, there were 
no differences when subjects were 
sated, but flexible subjects were more 
variable, and also more accurate, 
under the thirsty condition. 

Still a third procedure has been
widely employed. This technique,
which may be called "induction,"
consists in the introduction of special
conditions into the situation itself.
Thus, an experimenter attempts to
influence the subject's relation to the 
goal by creating perceptions of suc- 
cess or failure. Or noxious stress may 
be imposed by the administration of 
electric shock. Or verbal instructions 
may be given to induce ego-threat. 
In all of these cases, it is regularly 
found that conditions intended to be 
unfavorable—i.e., threat, failure, 
and so on—are likely to interfere 
with performance, whereas favorable 
conditions, especially success, facilitate
performance. Cognitive processes,
perhaps at a rather superficial
level, are thus readily affected by
experimental conditions.


LEVELS OF THE MOTIVATION SYSTEM


Rather than present more examples 
of the effects of specific conditions, I 
should like to indicate several of the 
complexities that emerge in current 
research. The single most important 
point to be made, in fact, is that 
cognitive behavior is an outcome of 
the interplay of many forces. Since 
I suppose no one has ever really 
doubted that this is the case, perhaps 
I should say that we are learning how 
to incorporate these complexities into 
our experiments. This is one facet 
of the revolution I mentioned before. 
Instead of designing simple, single- 
variable experiments, the necessity 
arises to plan multivariable ones. It 
is possible, of course, to look into the
future to a time when we may be able 
to return to simpler experiments by 
way of indexes to patterns of vari- 
ables. 

This may be illustrated by refer- 
ence to achievement. As McClelland 
et al. (1953) point out, a high degree 
of achievement imagery may be a 
function (a) of a high latent disposi- 
tion, or (b) of strong achievement cues 
in a picture-stimulus, or (c) of special 
orienting instructions designed to in- 
duce achievement. That is, one may 
affect achievement scores in any of 
three different ways, so that poten- 
tially we have three different kinds of 
"high achievement subjects." But 
there may be systematic relations 
among them. For one thing, of 
course, there may be correlations 
that could be established, although 
according to a study by Marlowe 
(1959), such relations may be small. 
Instead, it might be profitable to con- 
ceptualize categories of individuals 
high on pretest, but low on response 
to special cues and experimental in- 
duction; another category might be 
high on pretest and high under induc- 
tion, but low on special cues; and so 
on (see, for example, Martire, 1956). 
One might then find a single measure 
that would permit the direct identifi- 
cation of persons who fall into each 
category. 

In the meantime, however, we 
must recognize that the motivation 
system is highly complex, and that 
several variables may simultaneously 
require measurement. I find it meaningful
in this connection to speak
about "levels of the motivation
system." We may draw inferences about
several aspects of motivation—a 
point recognized in most theories of 
personality. A convenient distinc- 
tion is in terms of instigation or mo- 
tives (the relatively enduring and 
general forms of energy-expenditure,
what Judson Brown calls
"sources of drive"), regulation or
attitudes (the mechanisms which 
determine the course of activity once 
a change in instigation occurs), and 
adjustment or sets (the specific proc- 
ess that determines the immediate 
response itself). Of course, such dis- 
tinctions are bound to be artificial, 
and, in making them, I ignore many 
important and interesting issues dear 
to the heart of the motivation 
theorist. Nevertheless, to differenti- 
ate such levels casts a clarifying light 
upon the character of thinking. A 
few examples must suffice. 
Inferences about instigation may 
be based either upon conditions of 
deprivation (saying, for example, that 
the individual gets hungrier) or upon 
scores made on a test presumed to 
reflect latent tendencies like the TAT 
(thus, for instance, saying that a per- 
son is high in latent achievement). 
Two sorts of inference may be drawn: 
on the one hand, that a given condi- 
tion enables us to compare a higher 
with a lower state of instigation—for 
example, individuals low and high in 
latent achievement, or satiated and 
hungry; on the other hand, we may 
compare one kind of instigation 
(what may be called “predominance” 
of a motivational tendency) with 
another kind, for instance, achieve- 
ment versus affiliation. Both sorts of 
inference prove useful in research. 
Thus, the hungry chimp displays 
better instrumental problem solving 
than the satiated one (Birch, 1945), 
and the hungry college student 
seems to be more preoccupied with 
thoughts of food than the one who 
has just eaten (Wispé, 1954). The 
individual high in achievement will 
apparently solve more arithmetic 
problems under conditions intended 
to induce achievement than the indi- 
vidual high in affiliation, who works 
better when the incentive is to
"please the experimenter" (Atkinson 
& Reitman, 1956). 

Inferences about regulative mech- 
anisms may also be drawn from tests, 
which, in this case, seek to assess how 
a person typically handles a given 
kind of situation. The cognitive 
styles described by Klein belong in 
this class, as also do ego-defense and 
ego-strength processes, as well as 
values, attitudes towards problems 
and problem solving, and so on. That 
the identification of regulative proc- 
esses aids in understanding thinking 
has already been illustrated in the 
reference above to Klein's research. 
Another good example is provided in 
experiments by Schroder and Hunt 
(1957) and Scott (1956), who meas- 
ured failure-avoidance tendencies. 
Individuals marked by this characteristic
tend in problem solving situations
to set unduly high goals, to employ
few alternative solutions, and to act
in an unrealistic fashion. 

Finally, it pays, also, to take the 
adjustment level into account, since, 
for instance, the conditions manipu- 
lated by the experimenter in the 
situation itself may have important 
effects upon performance. Of course, 
the literature on set, which could be 
adduced as evidence, is enormous; in 
fact, the psychologist is especially 
skillful in influencing this level of the 
motivation system. The extensive research
on the induction of success and
failure provides the best example.
Particularly pertinent is the experiment
by Lantz (1945) with school
boys. After they played a game in
which they experienced either success
or failure, she administered Stanford-Binet
items to her subjects. Failure
had an adverse effect upon reasoning
tasks, whereas rote memory was not
affected.

This necessarily abbreviated discussion
serves to demonstrate that
complexity in thinking is a function 
of several inferably different motiva- 
tional levels. Response depends upon 
the organization of these processes 
into patterns. 


INTERACTION 


One feature of this organization 
that deserves special mention because 
of its methodological implications, is 
the interaction of one level with 
another. That is, the effect of a 
process at one level depends upon the 
characteristics of another level. For 
example, it has become apparent 
that it is not sufficient to specify 
the degree of manifest anxiety alone 
because persons "high" in this respect
respond differently under different
sorts of experimental instructions;
that is, the adjustment level
determines response as well as what- 
ever process we infer manifest anxiety 
to represent (see, for example, Sara- 
son & Palola, 1960). 

A good instance is provided by 
Miles (1958). In this experiment, 
subjects high and low in achievement 
imagery were also classified as either 
"analyzers" or "nonanalyzers" on the
basis of their approach to the Wechs- 
ler Block Design test. On a pursuit- 
meter task, high achievement analyz- 
ers proved to be superior to low 
achievement analyzers during the 
early stages of original learning, after 
which the latter performed better. In 
relearning, high achievement non- 
analyzers showed the greatest loss, 
whereas low achievement analyzers 
showed the least. 


RELEVANCE 


But, perhaps, a more fruitful way 
to look at the complexities of motiva- 
tion is in terms of the relevance of the 
situation to the inferred motives and 
attitudes of the subject. In this respect,
the character both of the induction
and of the task assumes a
clear significance. Although it is
usual for the investigator to build 
relevance into his experiment, the 
true importance of this factor only 
emerges sharply, as yet, in a few ex- 
periments. The most revealing of 
these was conducted by Elizabeth 
French (1958). She used groups of 
four subjects, each group composed 
of individuals high in either achieve- 
ment or affiliation. In addition, she 
induced individual compared to group 
orientation and also had the groups 
work either under task feedback 
(emphasizing the work situation) or 
under feeling feedback (emphasizing
the interaction situation). In a task 
requiring the reconstruction of stories, 
the groups high in achievement were 
more efficient with task feedback, 
regardless of orientation, whereas the 
groups high in affiliation were supe- 
rior under group orientation with 
feeling feedback. 

Other contributions to an under- 
standing of relevance have been 
made by Vogel, Raymond, and 
Lazarus (1959), Schönbach (1959),
and others. In these experiments, it 
is shown that performance is better 
when the task is relevant to the in- 
ferred motive than when it is irrele- 
vant. 

Unfortunately, time does not per- 
mit a more detailed analysis of the 
many interesting questions touched 
upon in this paper. However, it is 
quite apparent that there is a com- 
plex relation between the quantity 
and quality of thinking and the 
motivational state of the individual. 
Experiments are rapidly showing how 
these relationships may systemati- 
cally be explored. I am sure that we 
are approaching a time when the 
dynamics of thinking will be better 
understood. 

Motivation is a difficult and complex
problem in its own right. In
conjunction with thinking, a so-called 
complex process, the difficulties of 
analysis appear to be immeasurably 
compounded. This as well as other 
problems of thinking, however, may 
be made more amenable to experi- 
mental and theoretical analysis if we 
assume that thinking is a complex 
form of behavior involving changes 
within and between classes of re- 
sponses occurring on the basis of 
principles derived from simpler ex- 
perimental situations (Maltzman, 
1955). Such a position, of course, 
does not preclude the development of 
new concepts and principles which 
may be peculiar to thinking, but only 
that such conceptual additions be 
introduced when already established 
concepts are clearly insufficient for 
the task of analysis and explanation. 

Before proceeding with the con- 
sideration of motivation and think- 
ing, it would be advantageous to 
clarify our usage of some relevant 
terms with a simple illustration. A 
common experimental task employed 
in the study of thinking is anagram 
solving. The subject is presented 
with a series of jumbled letters such 
as a-p-h-e-c. His task is to construct 
one or more words using the letters 
presented. Presumably he has never 
encountered this anagram before.
The letters evoke corresponding ver- 
bal responses on the part of the sub- 
ject which when rearranged spell the 
words "cheap" and "peach." A sequence
of letter responses initially
evoked must occur in a new sequence 
for a successful solution of the prob- 
lem. This formation of new or dif- 
ferent response sequences is char- 
acteristic of thinking as distinguished 
from recall. Learning in the form of 
acquisition of response strength by 
the particular solution is prohibited 
by having only a single presentation 
of this letter sequence. 
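
The anagram task just described can be stated as a short sketch. The vocabulary below is a hypothetical stand-in for the words a subject has already learned, and the function name is illustrative only; neither is taken from any experimental procedure.

```python
from collections import Counter

# Hypothetical vocabulary: a stand-in for words the subject has already learned.
VOCABULARY = ["cheap", "peach", "patch", "reach", "chapel"]


def solutions(anagram, vocabulary):
    """Return the vocabulary words that use exactly the letters of the anagram."""
    target = Counter(anagram)
    return [word for word in vocabulary if Counter(word) == target]


# The letters a-p-h-e-c must be emitted in a new order to yield a solution.
print(solutions("aphec", VOCABULARY))
```

The comparison of letter counts ignores order, which mirrors the point that the same letter responses must occur in a new sequence for the problem to be solved.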

A number of different parameters 
may influence performance in such a 
situation. For example, problem 
solving difficulty may be manipu- 
lated by systematically varying the 
letter order of the anagram. Subjects 
in such experiments are usually col- 
lege sophomores who have repeatedly 
encountered the words peach and 
cheap. These responses have been
well learned and are familiar, although
not as responses to the anagram. A measure of
availability of the solutions may be
obtained from the Thorndike-Lorge
word count. Relevant learning of the
final solution response may be a con- 
stant in the experiment which may, 
for example, be concerned with the 
effects of instructions on problem 
solving. The influence of past learn- 
ing may be studied in such a situation 
by holding other characteristics of 
the anagrams and word solutions 
constant while varying Thorndike- 
Lorge frequencies of the solutions, 
thereby varying the extent of prior 
exposure to the solution. Instructions 
may be varied by specifying in varying
degree the response class to which
the solutions belong. The anagram 
may be clearly printed on an index 
card and visible to the subject during 
his problem solving efforts. However,
the anagram may also be exposed
tachistoscopically for brief
intervals, or the clarity of the type 
may be systematically varied. The 
three initial conditions of instructions, 
word frequency, and clarity of type, 
and the consequent frequency of cor- 
rect solutions, may be used to define, 
respectively, the influence of motiva- 
tion, learning, and perception upon 
the disposition called thinking. In a 
factorial design it would be possible 
to obtain an estimate of their inde- 
pendent effects upon the final re- 
sponse chain representing the solu- 
tion. In such experimental situations 
it is possible to isolate the effects of 
different variables, including motiva- 
tion, upon thinking. In other situa- 
tions it is not possible, very often be- 
cause there is no independent experi- 
mental operation available for manip- 
ulating the variable in question. The 
solution to this problem is simple. 
Such experimental situations should 
not be used. They do not represent 
potentially fruitful approaches to 
serious experimental work on think- 
ing. 
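
How a factorial design separates the three conditions may be sketched as follows. The cell values are hypothetical and are generated from an additive rule chosen purely for illustration; they are not data from any experiment. The two-level coding and all names are likewise assumptions.

```python
from itertools import product
from statistics import mean

# Two levels (0 = low, 1 = high) for each manipulated condition.
FACTORS = ("instructions", "word_frequency", "type_clarity")


def hypothetical_cell_mean(instructions, word_frequency, type_clarity):
    """Made-up proportion of anagrams solved; additive by construction, NOT real data."""
    return 0.30 + 0.10 * instructions + 0.20 * word_frequency + 0.05 * type_clarity


# One entry per cell of the 2 x 2 x 2 design.
cells = {levels: hypothetical_cell_mean(*levels) for levels in product((0, 1), repeat=3)}


def main_effect(cells, factor_index):
    """Mean at the high level minus mean at the low level, averaged over the other factors."""
    high = mean(v for k, v in cells.items() if k[factor_index] == 1)
    low = mean(v for k, v in cells.items() if k[factor_index] == 0)
    return high - low


for index, name in enumerate(FACTORS):
    print(f"{name}: {main_effect(cells, index):.2f}")
```

Because every level of one condition is crossed with every level of the others, each main effect is estimated independently of the remaining two, which is the sense in which instructions, word frequency, and clarity of type can stand for motivation, learning, and perception respectively.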


Motivational Concepts 


With respect to motivation per se 
we may be brief. The most exten- 
sive development of experimentally 
grounded motivational concepts is 
undoubtedly the S-R behavior theory 
formulations (Hull, 1943; Spence, 
1956, 1960). Since there are many 
excellent discussions of the relevant 
motivational concepts, they need not 
be indicated in detail here (Brown, 
1961; Farber, 1955). 

These motivational concepts may 
be classified as either associative or 
nonassociative in nature in terms of
their effects upon behavior. Effective
drive state (D) in its energizing function
is the principal nonassociative variable.
It represents the summation of
needs present at the moment, including
the frustration induced drive of
the anticipatory goal response (r_G).
Other motivational variables are
wholly or in part associative in
nature. The drive stimulus is an
associative variable. Differential performance
based upon characteristically
different internal cues is a consequence
of associative learning (Hull,
1933; Leeper, 1935). The anticipatory
goal response has both associative
and nonassociative functions.
Its response produced stimulus, s_G,
may direct behavior by serving as a
source of associative strength for
certain responses and not others. In
this manner it has a steering effect,
evoking specific responses culminating
in commerce with a particular goal.
Since arousal of the anticipatory
goal response and nonattainment of
the goal is assumed to be a source of
frustration drive which contributes to
the effective drive state, thwarting or
blocking of goal responses or conflict
among responses preventing goal attainment
may lead to an increase in
the energizing effects of drive (D)
(Amsel, 1958; Berlyne, 1960; Brown,
1961; Hull, 1952; Maltzman, 1952;
Marx, 1956; Seward, 1950; Spence,
1960).

Another distinction of particular
importance is between process and
state variables. A state variable
refers to a relatively permanent or
persistent disposition of the organism
where the antecedent conditions are
previous interactions of the organism 
and its environment (Maltzman, 
1952; Spence, 1948). Learning is a 
state variable, so also is a hunger 
need. 


MOTIVATION AND DIRECTION OF THINKING 


Process variables "represent, not 
states, but hypothetical, nonobserv- 
able responses, implicit processes, 
occurring in the individual" (Spence, 
1948, p. 76). A further characteristic
of process or response produced needs
is that they change relatively
rapidly as a function of time since
their inception. They are labile
rather than persistent dispositions 
to behave in a particular manner. 
Frustration or conflict induced mo- 
tives are of this kind as well as 
determining tendencies or Aufgabe 
and other kinds of verbal response 
produced stimuli. These process 
needs represent by far the most 
important sources of motivation for 
thinking. 


Associative Factors and 
Directed Thinking 

If the simplifying assumption of 
treating thinking as involving changes
within and between habit family
hierarchies is adopted, then much
of the S-R discussion of motivation,
particularly the Hull-Spence formu- 
lation, is immediately applicable to 
the problem of motivation and think- 
ing. 
The importance of the distinction 
between associative and nonassocia- 
tive variables may become more ap- 
parent if the problems of the direc- 
tion of thinking are stated as formu- 
lated by Humphrey (1940). He has 
characterized some of the problems 
which require explanation as follows: 

1. What determines the orderly 
succession of thoughts. 

2. What energizes the succession 
of thoughts. 

3. What keeps the successive 
thoughts relevant to the problem at 
hand. 

Humphrey's survey of theories of 
thinking circa 1940 indicated that 
each had difficulties in accounting for 
these phenomena. However, at that
time characterizations of complex be- 
havior in S-R terms were already 
available. A decade had passed since 
Hull's (1930) theoretical analysis of 
simple trial-and-error learning was 
published. There is a striking simi- 
larity in Hull's statement of the prob- 
lems posed by trial-and-error learning 
that demand explanation and those 
of directed thinking. 

1. What principle determines the
order of appearance of the different 
acts. 


2. What determines the organism's 
persistence in responding even after 
repeated failures. 

3. What principle limits the range 
of reactions which an organism will 
make to a problem. 

Hull's interpretation of the above 
characteristics of trial-and-error be- 
havior was as follows: 

1. The act evoked at any given
state of the trial-and-error process
is the one which is strongest at the moment.

2. The organism persists despite 
failure because the stimulus situation 
which evokes the acts itself persists. 

3. The range of reactions which 
may be evoked by a given problem 
situation is limited to the reactions 
which have become conditioned during
the life of the organism to one or
another of the stimulus components of that
situation.
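
The three interpretive points above can be illustrated with a minimal sketch. The response names and strength values are hypothetical, and the rule that a failed act loses strength by a fixed decrement is an assumption introduced only so that the order of acts can change; it is not stated in the summary above.

```python
def trial_and_error(strengths, correct, decrement=3, max_trials=50):
    """Return the sequence of acts emitted until the correct act occurs.

    strengths: act -> habit strength (the acts already conditioned to the situation)
    decrement: assumed loss of strength after a failure (illustrative only)
    """
    strengths = dict(strengths)  # work on a copy; the caller's hierarchy is unchanged
    emitted = []
    for _ in range(max_trials):  # the stimulus situation persists, so responding persists
        act = max(strengths, key=strengths.get)  # the act strongest at the moment is evoked
        emitted.append(act)
        if act == correct:
            break
        strengths[act] -= decrement  # assumption: failure weakens the act just made
    return emitted


# Hypothetical habit hierarchy; only these acts can ever appear.
habits = {"pull": 9, "push": 7, "lift": 5}
print(trial_and_error(habits, correct="lift"))
```

Only acts present in the initial hierarchy can be emitted, which corresponds to the third point: the range of reactions is limited to those already conditioned to some component of the situation.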

Hull in later years specified addi- 
tional internal sources of persistent 
stimulation responsible for the per- 
sistence of trial-and-error behavior. 
These were the drive stimulus, S_D,
and s_G, the stimulation produced by
the anticipatory goal response. Still
later he introduced the concept of 
drive state, D, energizing the habits 
present at a given moment. 

It is significant that at this informal
descriptive level all of the characteristics
of directed thinking presented by
Humphrey may be accounted for in
terms of Hull's theory of trial-and- 
error behavior. It should be further 
noted that all of the characteristics 
of directed or motivated thinking 
cited can be accounted for in terms of
associative variables. The question 
intruding itself, of course, is whether 
in fact it is necessary to hypothesize 
an effective drive state energizing 
thinking. Even if experimental evi- 
dence suggests that such a concept 
is necessary, much of what is con- 
sidered motivated thinking can be 
accounted for in the usual associative 
terms without invoking uniquely dif- 
ferent motivational or drive concepts. 
Excellent analyses of this problem in 
different contexts have already been 
presented by Brown (1961), Farber 
(1955), and Postman (1953), among 
others. Farber (1955), for example, 
has shown how many studies of 
motivation in imaginative produc- 
tions such as achievement motivation 
can be adequately explained in 
strictly associative terms. 

Despite the marked similarity be- 
tween the characteristics of directed 
thinking and simple trial-and-error 
learning, it is apparent that there are 
differences between the two. Hum- 
phrey (1940) has not exhaustively 
described the characteristics of di- 
rected thinking. For one thing, reac- 
tions to one or another component of 
a problem situation may occur which
have never been conditioned to 
them. Shifts in the entire direction 
of thinking may occur in an unchang- 
ing problem situation. There may be 
no external stimulus situation con- 
stituting a problem at the moment a 
given direction of thinking occurs, or 
the problem situation is an unrelated 
one. But each of these additional 
characteristics of directed thinking 
may be accounted for in associative 
terms. Of particular importance in
this connection are the concepts
of conditioned generalization, com- 
pound habit families, and response 
produced cues. The question still 
remaining, however, is whether there 
are any characteristics of motivated 
thinking that require the introduction 
of nonassociative constructs. We 
think that there are. Although cur- 
rently available experimental evi- 
dence is exceedingly meager, we 
would assume that nonassociative 
variables of the process need variety 
which are a consequence of verbal 
response produced stimulation are 
necessary for an adequate account of 
certain characteristics of directed 
thinking. 

Conflict of incompatible response 
tendencies as a source of motivation 
or drive has repeatedly been of 
interest in the recent history of psy- 
chology. A number of different 
formulations of conflict induced 
drives have been proposed by con- 
temporary theorists (Amsel, 1958; 
Berlyne, 1960; Brown, 1961; Marx, 
1956; Seward, 1950; Spence, 1960), 
and some such conception was a basic 
one in the historical school of func- 
tional psychology. Dewey (1895) 
formulated a conflict theory of emo- 
tions, and the notion that thinking 
arises when there is a thwarting of 
habitual responses (Dewey, 1933) 
suggests an intimate relation between 
thinking and drive. Bartlett (1925) 
has explicitly considered the inter- 
dependence of the two variables from 
a consciousness centered point of 
view. Berlyne (1960) has recently 
provided the most extensive analysis 
of conflict in relation to thinking. 
With characteristic acuteness he has 
analyzed many different situations in 
terms of his conceptions of conflict 
and has integrated much of the recent 
research on curiosity and exploratory 
behavior in this country and the 
work on the orienting reflex in Russia.
Unfortunately at the present time his 
ingenious formulations are largely 
speculations in the sense that they go 
considerably beyond the experi- 
mental data. There is a striking lack 
of unambiguous data on the influence 
of conflict induced drives upon think- 
ing. 

The research that most closely
approximates this problem consists of
studies of the effects of frustration 
induced by insoluble problems on 
subsequent problem solving behavior. 
The hypothesis is that arousal of an 
anticipatory goal response, verbaliza- 
tion of a goal or problem solution 
(Maltzman, 1955) induces something 
akin to frustration drive and is a 
source of reinforcement when a solu- 
tion is attained. This terminates the 
sequence of thinking. 


Frustration Induced Drive 
and Thinking 


Although drive has an energizing 
function, this is not to say that it 
necessarily is facilitating. Since drive 
(D) indiscriminately multiplies all or 
most habits present at the moment, 
successful problem solving behavior 
may be facilitated or inhibited de- 
pending upon the nature of the habit- 
family hierarchies involved. To con- 
duct experimental studies of drive 
and thinking, it is necessary that 
there be independent control of the 
antecedent conditions related to the 
hypothetical variables of drive and 
habit-family hierarchy. Failure on 
insoluble problems meets the require- 
ments for conflict induced drive, and 
the so-called mental set or Einstel- 
lung experiment permits the experi- 
mental manipulation of the habit- 
family hierarchy. 

A common procedure employing 
Luchins' (1942) water jar problem is
to present subjects with a series of 
problems all solvable by means of the
so-called long solution. As a result of 
training, this solution acquires the 
greatest habit strength. When a 
problem is presented in which an 
alternative short solution is possible 
the dominant tendency as the result 
of training is to give the long solu- 
tion. If the effective drive state is 
increased by frustration, the absolute 
difference in reaction potential for 
the long and short solutions will in- 
crease. There will be a greater 
tendency for the long solution to 
occur as a consequence. Cowen 
(1952) and Pally (1955) have em- 
ployed essentially this procedure ob- 
taining results in accord with the 
drive interpretation. 
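
The arithmetic behind this prediction can be sketched briefly. The
block below assumes the simple multiplicative rule that reaction
potential equals drive times habit strength; the habit and drive
values are hypothetical numbers chosen only for illustration and are
not taken from any of the studies cited.

```python
# Illustrative sketch only: assumes reaction potential E = D * H.
# All numeric values are hypothetical, not data from the cited studies.

H_LONG = 0.8   # assumed habit strength of the trained long solution
H_SHORT = 0.4  # assumed habit strength of the untrained short solution


def reaction_potential(drive: float, habit: float) -> float:
    """Drive multiplies habit strength to give reaction potential."""
    return drive * habit


def solution_gap(drive: float) -> float:
    """Absolute difference in reaction potential, long minus short."""
    return reaction_potential(drive, H_LONG) - reaction_potential(drive, H_SHORT)


if __name__ == "__main__":
    for drive in (1.0, 2.0):  # e.g., before and after frustration
        print(f"D = {drive}: long-short gap = {solution_gap(drive):.2f}")
```

Because drive multiplies both habits, raising D widens the absolute
gap in favor of whichever solution is already dominant, which is why
the long solution is expected to occur more often after failure.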

In the absence of a clearly estab- 
lished habit-family hierarchy, Rhine 
(1957) found that failure on pretest 
anagrams produced a significant dec- 
rement in test anagrams. There was 
a tendency for a greater frustration 
decrement when failure and test ana- 
grams belonged to the same response 
class than when they belonged to 
different response classes. These re- 
sults suggest that associative vari- 
ables at least in part are involved in 
failure produced interference with 
problem solving, and would make a 
drive interpretation superfluous in 
the absence of additional evidence of 
facilitation effects. 

Solly and Stagner (1956) have 
found that solution times for an 
anagram increased significantly as a 
function of prior failure on insoluble 
anagrams. Again, an associative as 
well as a nonassociative or drive 
interpretation may be given these 
results. Evidence supporting the 
drive notion, although by no means 
definitive, is the fact that measures 
of palmar sweating taken before and
after the experiment revealed that
the failure group showed a significant
increase in palmar sweating as
compared to a control group. 
Glasser (1958) also employed ana- 
grams as test materials in a study of 
the effects of failure. He frustrated 
subjects with dissimilar stimulus ma- 
terials, the water jar problem. Reli- 
able evidence was obtained for facili- 
tation following failure when the 
previously established dominant ana- 
gram set remained dominant. He 
likewise found that the galvanic skin 
response increased as a consequence 
of problem solving failure. 
Although these studies of failure 
frustration may be given a drive 
interpretation, they do not unam- 
biguously support such a conception. 
As Child and Waterhouse (1953) 
have indicated, individuals may learn 
to perform differentially in the pres- 
ence of frustration produced cues. 
Frustration may be learned as a cue 
for more effective thinking. Such an 
associative interpretation of the con- 
sequences of failure in, for example, 
Cowen’s (1952) and in Glasser's 
(1958) studies must be ad hoc since 
the specific conditions under which 
facilitation or interference occurs on 
an associative basis are not stated. 
A drive interpretation of frustration 
as formulated within the Hull-Spence 
theory of behavior more explicitly 
states the conditions under which 
these opposed effects of frustration 
are to be expected. But this as well 
as other S-R theories also assumes 
that drive stimuli may serve as 
internal cues. Specification of the 
precise conditions under which frus- 
tration induced cues are conditioned 
to different behavior patterns and the 
conditions under which these be- 
haviors are reinstated is a serious 
problem for behavior theory gen- 
erally. A clear demonstration of 
frustration drive energizing problem 
solving would require a factorial de- 


sign with dominant correct and in- 
correct mental sets, habit-family 
hierarchies, evoked following condi- 
tions of failure and nonfailure. It 
should be possible to demonstrate for 
the same subjects that frustration 
may result in superior or inferior 
problem solving depending upon the 
interaction of drive and dominant 
habit structures present at the mo- 
ment. Glasser (1958) designed his 
experiment in this manner, but could 
not perform an adequate test because 
he failed to obtain a dominant set for 
the incorrect solutions. Maltzman, 
Fox, and Morrisett (1953) obtained 
facilitation and interference with 
manifest anxiety as an irrelevant 
drive, but their experiment is open to 
the criticism that the opposed effects 
were obtained with different groups 
of subjects and with two different 
kinds of problems. That the pre- 
dicted effects of frustration induced 
drive on problem solving may be obtained
is indicated by the unambiguous
results of Castaneda and Lipsitt
(1959) employing a simple motor 
learning situation. 
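
The factorial design described above can be laid out as a small table
of cells. The sketch below is only a schematic of the predictions
stated in the text under the drive interpretation; the labels and the
function name are hypothetical conveniences, and the nonfailure cells
are treated as the comparison baseline by assumption.

```python
# Schematic of the 2 x 2 design: dominant set x prior failure.
# Labels are hypothetical; predictions follow the drive interpretation.
from itertools import product

DOMINANT_SETS = ("correct", "incorrect")
PRIOR_CONDITIONS = ("failure", "nonfailure")


def predicted_effect(dominant_set: str, prior_condition: str) -> str:
    """Predicted change in problem solving relative to nonfailure."""
    if prior_condition == "nonfailure":
        return "baseline"  # assumed comparison condition
    # Failure raises drive, which strengthens whichever set is dominant.
    return "facilitation" if dominant_set == "correct" else "interference"


design = {
    cell: predicted_effect(*cell)
    for cell in product(DOMINANT_SETS, PRIOR_CONDITIONS)
}

if __name__ == "__main__":
    for (dominant_set, prior_condition), effect in design.items():
        print(f"{dominant_set:9s} set, {prior_condition:10s}: {effect}")
```

The critical comparison is the crossover between the two failure
cells, obtained with the same subjects and the same kind of problem.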

Another relevant condition when 
considering the effects of frustration 
drive on problem solving is the simi- 
larity between the failure and test 
situations. Whether or not the cor- 
rect habit family is dominant, decre- 
ments in performance may occur if 
subjects are frustrated with highly 
similar problems whereas facilitation 
would only occur when failure is in- 
duced with different materials. Un- 
der the latter condition competing 
responses are associated with the 
failure and not the test situation. 
Under such dissimilar stimulus con- 
ditions only the nonspecific drive 
state could influence performance in
the test situation. Support for this 
hypothesis stems from animal studies 
of the effects of shock induced emotionality
which have shown that rats
shocked in one situation show a sig- 
nificant increase in water intake when 
tested in a different stimulus situation 
(Amsel & Maltzman, 1950). When 
shocked and tested in the same situa- 
tion they show a decrement in con- 
summatory behavior (Amsel & Cole, 
1953). 

Murdock (1952) has conducted a 
study of the effects of failure which is 
not concerned with frustration drive 
as an energizer but as a source of reinforcement
for avoidance learning.
Employing a mediated generalization 
procedure he demonstrated that fail- 
ure associated with particular re- 
sponses retarded subsequent percep- 
tual motor learning where these 
responses could serve as mediators of 
new learning.  Avoidance of the 
mediating responses associated with 
failure presumably was reinforcing. 
Extension of the analysis of avoid- 
ance of implicit verbal responses may 
serve as the basis for so-called sup- 
pression and repression in thinking 
(Dollard & Miller, 1950). 

In contrast to the previous experi- 
ments, Kendler, Kendler, Pliskoff, 
and D'Amato (1958) have conducted 
an ingenious experiment suggesting 
the energizing role of a positive in- 
centive. The problem solving situa- 
tion that they employed required the 
assembly of previously isolated habit 
segments on the part of preschool 
children. Such reasoning was found 
to be dependent upon the presence of 
a positive incentive as well as rein- 
forcement of the separate habit seg- 
ments. 


Determining Tendency and Process 
Needs 


Historically the most prominent 
concept employed in the analysis of 
directed thinking has been mental set, 
a term used here initially to include 
terms such as Aufgabe, determining
tendency, and goal idea, etc. The 
kind of behavior usually described as 
a manifestation of mental set is an 
increased frequency of occurrence of 
a particular class of responses. A con- 
comitant behavioral change often 
used as a criterion of mental set is a 
decreased frequency of occurrence of 
a new class of responses. It is obvious 
that such behavioral changes can be 
produced by a variety of antecedent 
conditions such as those commonly 
related to learning, motivation, gen- 
eralization, in other words all of the 
variables known to influence be- 
havior. 

Two kinds of set may be distin- 
guished on the basis of their ante- 
cedent conditions and functional 
relations to consequent behavior 
changes. These correspond to state 
and process variables. The learning 
of response classes or the acquisition 
of habit strength by a class of stimuli 
for the elicitation of a particular class 
of responses is of the state variable 
kind. A second kind of set that has 
been studied employs task instruc- 
tions which produce a change in the 
compound habit-family hierarchy by 
increasing the reaction potential of a 
class of responses through arousal of 
their anticipatory goal response. 


Sets developed through training are based on 
the growth of habit strength, and therefore 
develop relatively slowly and have a degree of 
permanence. In contrast, sets induced by in- 
structions are labile, are aroused rapidly and 
need not persist (Maltzman, Eisman, Brooks, 
& Smith, 1956, p. 420). 


The latter kind of set or determining 
tendency possesses the character- 
istics of a process need. Arousal of the 
anticipatory goal response presum- 
ably produces an immediate increase 
in reaction potential for all its associated
responses, the habit-family hierarchy
of which it is a member. Such
an effect occurs because the goal re- 
sponse enters into a multiplicative 
relationship with habit strength in the 
determination of reaction potential 
(Maltzman, 1955). It constitutes a 
major source of drive in the energiz- 
ing sense, and directs thinking by 
energizing primarily those responses 
which constitute a common habit- 
family hierarchy, a common core of 
intraverbal associations. Arousal of 
the anticipatory goal response and 
continued nonattainment of the goal, 
frustration, produces further increases 
in drive. Frustration may be the sole 
antecedent source of drive, since in 
problem solving by definition a cor- 
rect response does not initially occur. 
Solution of a problem terminates the 
goal response, reduces frustration, 
and is thereby reinforcing. Whether 
all thinking is goal directed or not we 
do not know. In part this question is 
a matter of definition, how "goal" is
specified. 
Experiments demonstrating that 
instructions may produce a rapid 
increase in the probability of a re- 
sponse class have been reported by 
many investigators (e.g., Hunter, 
1956; Maltzman et al., 1956; Maltz- 
man & Morrisett, 1953; Rees & 
Israel, 1935) since the Würzburg
School (Humphrey, 1951; Titchener, 
1909). In problem solving studies, it 
has repeatedly been demonstrated 
that merely instructing subjects at 
the start of a session that the ana- 
gram solutions all belong to a par- 
ticular class, such as nature words, 
produces reliable evidence of facilita- 
tion. It may be argued that the effect 
is obtained because the instructions 
reduce the range of appropriate re- 
sponses by eliminating many incor- 
rect solutions before they are at- 
tempted. This is quite correct, but 
the basis for the effect is the one 


previously stated. A particular class 
of solution responses is immediately 
increased in reaction potential, be- 
comes dominant in the compound 
hierarchy of habit families elicitable 
by the task. 

Determining tendencies as a form 
of process need are often instigated by 
the instructions or verbal response 
produced cues of the experimenter. 
Another pervasive form of process 
need has gone under a variety of 
different labels such as prompts, hints, 
recency, priming, etc. (Cofer, 1960; 
Maltzman & Simon, 1959; Osgood, 
1957; Skinner, 1957; Storms, 1958). 
Judson and Cofer (1956) have dem- 
onstrated the effect employing the 
Four-Word Problem Test. Two of the 
words in each item of this test are 
ambiguous in that they are members 
of two different relevant response 
classes or habit-family hierarchies 
while the other words are unambigu- 
ously a member of only one of the 
two relevant classes. The subjects' 
task is to indicate which word is un- 
related to the other three in the item. 
An illustrative item is the following: 
"Add" "Subtract" "Multiply" "Increase."
Add and Multiply are ambiguous
words in that they belong to
both the class of arithmetic operations 
and of growth functions. Subtract 
and Increase are unambiguous in that 
the former belongs only to the rele- 
vant class of arithmetic operations 
while the latter belongs only to the 
relevant class of growth. 

The hypothesis tested was that 
whichever of the unambiguous words 
occurred in an item first would acti- 
vate or differentially increase the 
reaction tendency for that class of responses,
resulting in the exclusion of
the second unambiguous item as un- 
related to the other three words. 
Thus if Subtract appeared in the 
item before Increase, the hierarchy 
of arithmetic operations would be
differentially facilitated and Increase 
would be eliminated as irrelevant. If 
Increase occurred first the response 
class to which it belongs would receive 
an increment in reaction potential so 
that Subtract would be eliminated. 
The word order of the unambiguous 
terms was systematically varied. 
Results obtained in two studies indi- 
cated that the unambiguous word 
which appears first is less frequently 
eliminated than the second regardless 
of the particular word involved or the 
particular position in the item. These 
results indicate that the occurrence of 
a single verbal response increases the 
probability of occurrence or avail- 
ability of a class of associated re- 
sponses or habit family. 
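
The priming hypothesis for the illustrative item can be stated as a
short procedure. The sketch below is a hypothetical rendering of the
prediction only, not of the test materials or scoring used by Judson
and Cofer; the class labels "arithmetic" and "growth" and the function
name are assumptions introduced for the example.

```python
# Hypothetical sketch of the predicted exclusion in a four-word item.
# Class memberships follow the illustrative item described in the text.

ITEM_CLASSES = {
    "Add": {"arithmetic", "growth"},       # ambiguous
    "Subtract": {"arithmetic"},            # unambiguous
    "Multiply": {"arithmetic", "growth"},  # ambiguous
    "Increase": {"growth"},                # unambiguous
}


def predicted_exclusion(item: list[str]) -> str:
    """Return the word predicted to be judged unrelated to the others."""
    # The first unambiguous word encountered primes its response class.
    first_unambiguous = next(w for w in item if len(ITEM_CLASSES[w]) == 1)
    (primed_class,) = ITEM_CLASSES[first_unambiguous]
    # The word falling outside the primed class is excluded.
    outsiders = [w for w in item if primed_class not in ITEM_CLASSES[w]]
    return outsiders[0]


if __name__ == "__main__":
    print(predicted_exclusion(["Add", "Subtract", "Multiply", "Increase"]))
    print(predicted_exclusion(["Add", "Increase", "Multiply", "Subtract"]))
```

Reversing the order of the two unambiguous words reverses the
predicted exclusion, which is the manipulation the word-order
variation was designed to test.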

Another problem situation re- 
ported by Judson, Cofer, and Gelfand 
(1956) yielded data again suggesting 
that the occurrence of a given re- 
sponse increases the likelihood of oc- 
currence of associated responses. 
Here, reinforcement in a verbal prob- 
lem of one member of a word-associa- 
tion hierarchy increased the frequency 
of selection of associations from that 
same hierarchy when they appeared 
individually in different verbal prob- 
lems. 

Another study (Maltzman & 
Simon, 1959) has shown that the un- 
commonness of responses to a word- 
association list increases as a function 
of the recency of association to differ- 
ent stimulus words. 

Systematic research on the param- 
eters of response-produced process 
needs is urgently needed, and repre- 
sents one of the most intriguing prob- 
lems in the area of thinking. It also 
represents what appears to be a prob- 
lem largely peculiar to verbal be- 
havior. Considerably more research 
is needed before these verbal response- 
instigated changes can securely be 
characterized as involving changes in
drive state (D). Particularly perti- 
nent in this respect would be system- 
atic research designed to determine 
whether or not decrements in process 
needs contribute to the acquisition 
of habit strength, whether they may 
serve as the basis for learning. 

One last source of verbal response- 
instigated needs must be mentioned, 
one which is of considerable practical 
and theoretical significance, but which 
has received much less study in this 
country than in the Soviet Union. 
This is the changes in drive states 
that may be induced as a consequence 
of the instigation of physiological 
changes conditioned to verbal re- 
sponse-produced stimuli. Razran's 
(1961) review of research in Russia, 
as well as translations of such work, 
attest to the profound influence con- 
ditioning may have upon physio- 
logical functioning. Of particular 
pertinence to our present discussion 
is the role of verbal response-produced 
stimuli or the second signal system as 
complex conditioned stimuli for phys- 
iological responses. For example, 
Platonov (1959) reports verbally con- 
ditioned pain reactions as well as 
analgesia, the latter condition known 
for many years to be produced under 
hypnosis. But as Platonov convinc- 
ingly argues, hypnosis is a form of 
verbal conditioning. Among other 
physiological changes, Platonov re- 
ports that the leucocyte count may be 
increased as well as reports of feelings 
of hunger by suggesting to the subject 
that he is hungry, by employing 
words as conditioned stimuli for these
physiological changes. Appropriate
suggestion likewise produced a
drop in the leucocyte count. Experi- 
mental studies are clearly needed in 
which an attempt is made to verify 
these findings and to determine 
whether performance under different 
conditions such as classical and
operant conditioning, simple verbal 
learning, and problem solving, are 
affected in the predicted manner if 
changes in drive are in fact induced. 

The area of thinking is of central 
importance in the study of human be- 
havior. There has been an obvious re- 
awakening of interest in problems in 


this area during the past decade and 
some signs of real progress. Within 
this relatively neglected area motiva- 
tion has been a most neglected prob- 
lem. This state of affairs will surely 
change. But it will change for the 
better only if there is systematic ex- 
perimental research integrated with 
rigorous behavior theory. 

FORMATION AND TO CONCEPT CONTENT

Motivation, in current psychological
theory, has two general effects.
To motivate is to arouse; to cause the 
organism to attend more closely to 
the environment. To motivate is also 
to influence the direction of the or- 
ganism's attention; to increase the 
probability that the organism will 
respond to one class of stimuli rather 
than other classes of stimuli. 

It is the directional aspect of moti- 
vation as it is related to concept for- 
mation that shall be discussed in this 
paper. It is generally known that 
abstract concepts are more difficult 
to learn than are concrete ones. One 
means of explaining this empirical 
fact is to invoke various develop- 
mental theories. The purpose of this 
paper is to discuss these theories, to 
examine the data supporting these 
theories, and to offer an alternative 
explanation of observed differences 

in the adequacy of attainment of con- 
crete and of abstract concepts. This 
explanation is based on certain ideas 
of Benjamin Lee Whorf (1956) which 
form the basis for a theory of lin- 
guistic determinism. 

The formation of a concept involves
the development of awareness of 
characteristics common to certain 
objects, attributes, or ideas, so that 
those objects, attributes, or ideas hav- 
ing elements in common may be 
grouped into a separate category or 
class. Vinacke (1952) defines concept 
formation as involving 


processes of perception and learning by means 
of which the individual develops an organized 
and coherent relation to the outside world. 
The consequences of these processes is the 
establishment of concepts, the cognitive 


structures which link the individual's present 
perceptions and learning to his previous ex- 
perience (p. 98). 


Vinacke continues, following the 
above quotation, by saying that we 
must distinguish between the process 
of concept formation and the contents 
of the concepts formed, a distinction 
that I will try to maintain in this 
paper. 

A number of psychologists, such as 
Piaget (1950, 1951), Goldstein and 
Scheerer (1941), and Werner (1948) 
have constructed developmental theo- 
ries bearing on the problems of con- 
cept formation and on “mental struc- 
ture." These theories differ in various 
ways, but have a number of points in 
common. The child is said to develop 
(once past a strictly sensory-motor 
stage) from a concrete-perceptual to 
an abstract-conceptual level of
thought. The change from concrete- 
ness to abstractness is a relatively 
sudden, saltatory two-stage process, 
with mature abstractness in thought 
first occurring at about age 12, as a
result of unspecified changes in 
mental structure. 

Since definitions of concreteness 
and of abstractness vary, the defini- 
tions used herein are offered below. 
The formation of concrete concepts
involves an organization of experience
in which the grouping of perceptual
elements of the stimulus
situation is sufficient for adequate
categorization. The formation of
abstract concepts involves some form
of classification of experience for
which sensory experience is insufficient,
in itself, as a basis for accurate
categorization. Further organization,
beyond the purely perceptual level,
is required before adequate concep- 
tualization can occur. Concrete con- 
cepts are generally horizontal; ab- 
stract hierarchical or vertical in na- 
ture. Such concepts as size, color, or 
shape are examples of concrete con- 
cepts, while such concepts as justice, 
causality, and life (this last referred 
to as animism when incorrect in con- 
cept content) are examples of abstract 
concepts within this definition. There 
is certainly more to concreteness and 
abstractness than this, but this defi- 
nition is adequate in the sense that all 
of the concrete-abstract theorists
cited would probably accept it as a 
minimal definition. 

It may be muddying the waters at 
this point, but I would like to suggest 
that there are three, not two, levels of 
concepts along this concrete-abstract 
dimension. The first, most concrete 
level is that which refers to the at- 
tributes of objects, such as size, shape, 
or color. A second stage, while still 
concrete-perceptual, involves those 
stimulus characteristics of objects 
that are more dependent on external 
frames of reference, such as quantity 
and position. The third stage, in this 
division, would be, as in previous 
categorizations, the abstract-concep- 
tual level for which purely perceptual 
organization is insufficient. 

From the developmental views of 
Piaget, Goldstein, Scheerer, and 
Werner, we would expect concrete 
concepts to be formed earlier than 
abstract ones. Since the development 
of abstract concepts depends on a 
qualitative, agebound, presumably 
physiological change in mental struc- 
ture, we would expect that: 

1. No abstract concepts will be 
formed by children before approxi- 
mately age 12. 

2. To the degree that concepts are 
equally abstract, they should be 
formed at about the same time by an
individual. 

3. Individuals within a culture 
should not vary markedly in age of 
concept attainment. 

4. There should be little cross cultural
variation in age of concept attainment
between members of various
nonprimitive societies.

I include the phrase "nonprimitive,"
since for some theorists, such as 
Werner, all primitive people, even 
when adults, are "concrete." Per- 
haps, as Roger Brown (1958, p. 297) 
suggests, we have the unfortunate 
habit of labeling the thoughts of 
others as concrete, if the thoughts 
follow unfamiliar lines, while reserv- 
ing the accolade of abstract for our 
own brand of thinking. 

Let us examine the development of 
some abstract concepts to determine 
whether they fulfill these expecta- 
tions. 

Piaget (1929) investigated animism 
in children, attempting to measure 
developmental changes in the kinds 
of objects that subjects believed to be 
alive. Children first believed any- 
thing active to be alive; e.g., a tele- 
phone is alive, at least when it is ring- 
ing. Later in development, life is 
attributed only to objects that move. 
Still later, life is attributed to objects 
that move without visible external 
stimulation, such as the sun or the 
wind. Finally, consciousness or life is 
restricted to plants and animals by 
about age 12. Most investigators 
(e.g., Russell, 1940; Russell & Dennis, 
1939) have found these stages to exist 
and have found some age progression, 
but have also shown considerable 
variation in the level of explanation 
used at any age level. Dennis (1942) 
found his daughter (IQ 150) passed 
through Piaget’s four stages, reaching 
an adult level of understanding of the 
concept, "life" at 6 years 2 months. 




Dennis (1953) also studied college 
students and found that a consider- 
able proportion of them retained some 
animistic beliefs. In one of the rela- 
tively few cross cultural studies (as 
opposed to observations, of which 
there are a great number) of ani- 
mism, Huang and Lee (1945) found 
almost no evidence of animism in 
Chinese children. While the results 
depend, to a considerable extent, on 
the procedure followed, intra- and 
intercultural studies suggest no uni- 
versal, saltatory age change in con- 
cept content. 

An aspect of a concept of justice, or 
of moral judgment, is moral realism. 
Moral realism is the tendency to 
judge acts exclusively in terms of con- 
sequence without regard for motives. 
Maine (1861) says that the history of 
jurisprudence is that of a gradually 
increasing concern for motives. We 
have not yet fully evolved to what 
Maine (1861) or Piaget (1932) would 
call complete moral maturity in this 
area, even as adults. Certain age 
changes in moral judgment fit the 
concrete-abstract explanation of 
changing mental life. Piaget (1932) 
finds young children are morally 
realistic, while older children are not. 
This is a result, in part, of the con- 
creteness of children's thought proc- 
esses. The change toward a concern 
for motives is complete, according to 
Piaget, at about age 12. Yet very 
young children in the Chinese culture 
have already advanced beyond this 
stage (Liu, 1950). The advancement 
of Chinese children beyond Cau- 
casians of the same age is a result, 
says Liu, of a culture and a philoso- 
phy that require acts to be considered 
in terms of cause or motive. On the 
other hand, high school children in our 
culture still show many evidences of 
moral realism (Johnson, 1962). 

Causality is another concept studied
by Piaget (1930) and others. The
child moves from magical explana- 
tions in terms of some animistic voli- 
tion of the object (e.g., the candle in 
the airtight jar goes out because it is 
tired) through a large number of types 
of explanation to a final adult con- 
ception of causality. Here again there 
is evidence that age progression in the 
achievement of accurate concept con- 
tent is not as neat as Piaget suggests, 
since almost all forms of explanation 
are present among children by about 
age 8 (Deutsche, 1937). On the other 
hand, we, as adults, seem to be by no 
means “out of the woods" in our 
handling of causality. 

I would like to mention one more 
abstract concept, that of "time."
While time has concrete referents, 
such as day, night, and moon, many 
more distinctions, only some of them 
tied to clock time, are possible. Ames 
(1946) reports that by about age 8 
the superior child's understanding of 
the time concept is essentially a
mature one.

I have presented data concerning 
four concepts that usually are con- 
sidered to be abstract ones. The dis- 
cussion has centered largely on con- 
cepts first studied by Piaget, since 
Piaget's variety of concrete-abstract 
theory is the one for which the most 
evidence is presented. To the extent 
that other theorists would accept my 
definition of abstractness, the results 
obtained in testing Piaget's ideas 
should also hold for their theories. 

Presumably, from a concrete-abstract
theoretical framework, abstract
concepts should be formed only after 
about age 12. Looking at research on 
children, it seems pretty clear that 
what are usually termed abstract 
concepts are formed rather early, but 
with inaccurate or inadequate con- 
tent based on concrete perceptual 
elements of the stimulus situation
that are not central to the concept, 
but instead (as in the case of move- 
ment being confused with life) show- 
ing a fortuitous relation to the con- 
cept. The 5-year-old has a concept of 
life; it is just that we believe him 
wrong in concept content when he 
attributes life to a moving locomo- 
tive. 

Abstract concepts may be gener- 
ally formed so that concept content 
is accurate during the period when a 
child is said to be tied to concrete-perceptual
categorizations of events,
as in the case of the concept of time 
achieved by superior 8-year-olds. 
Individuals within a culture vary 
greatly in the age at which adequate 
concept content is attained, as for 
example, in concepts of causality and 
animism. Further, considerable vari- 
ation exists between subjects from 
different cultures in the age at which 
a given level of concept content is 
attained. These data do not seem to 
argue in favor of a theory of concept 
formation based on apparently innate 
saltatory changes in mental structure 
that should occur at approximately 
the same age for all individuals in 
nonprimitive cultures. 

I do not believe that the develop- 
mental theorists whom I have dis- 
cussed have proven their position to 
be correct. In fact, the evidence 
appears to be against their point of 
view. 

I would now like to present another 
possible explanation for the fact that 
the content of concrete concepts is 
generally easier to learn than the 
content of abstract concepts. In this 
connection I shall discuss concept 
formation and concept content from 
what is usually referred to as a
Whorfian point of view. This position
(Whorf, 1956) is that higher levels of 
thinking are dependent on language 


probability is increased that these 
objects or events will be categorized 
in some other fashion. 

However, the availability of a very 
large number of words to denote ob- 
jects falling within one potential class 
or category also appears to decrease 
the probability of forming a higher 
level concept encompassing the whole 
class or category, as shown, for ex- 
ample, in the Lapps having no generic 
name for snow (Werner, 1948) and 
the Bakairi of Brazil having many 
names for kinds of parrots but no 
name for the class that we would call 
“parrot” (Brown, 1958). The rela- 
tion between the number of words 
available to denote objects that are 
similar along some dimension and the
probability of development of a
concept encompassing this dimension 


more likely to be formed within the 
broad range from very few to very 
many words. 

It seems likely that it is in the area 


accurate notion of concept content, 
whether as a result of the greater 
availability of symbols or of increased
practice at discrimination. 

The "goodness" of a concept, in 
terms of allowing accurate categoriza- 
tion of objects or events, would de- 
pend on two things. The first, already 
mentioned, is the number of words 
available for categorization of objects 
into any particular higher level con- 
cept. The second is the exclusiveness 
of the words used to denote the essen- 
tial elements of the concept. For 
example, a characteristic of auto- 
mobiles is movement. Movement is 
not, however, a good term to use as a 
verbal base for the concept auto- 
mobile. Automobiles are still auto- 
mobiles even when not moving. 
Many other objects move. 

The adequacy of a concept, in 
terms of content, would, from a 
linguistically relativistic point of 
view, depend on the number of words 
available to describe elements of the 
concept and on the exclusiveness of 
these words. I have some data I wish 
to present, regarding the number and 
the exclusiveness of English words 
that are related to various concrete 
and abstract concepts. This study 
was modeled after that of Brown and 
Lenneberg (1954). In the Brown and 
Lenneberg study 10 judges were pre- 
sented with the entire series of 
Munsell colors at the highest level of 
saturation and asked to map the color 
regions referred to by the color names 
of red, orange, yellow, green, blue, 
purple, pink, and brown and also to 
choose one color chip in each of these 
color areas that best represented that 
color. A high amount of agreement 
was shown between judges in map- 
ping color areas and in choosing 
"ideal" chips. These 8 ideal chips 
plus 16 other chips that were re- 
moved, to varying degrees, from the 
central or ideal colors of the various 
color areas were presented to 24 sub-
jects who were asked to name each of
the 24 colors as it was presented. 
Five measures were taken during 
color naming. These were: (a) aver- 
age syllable length of the names 
given to each color, (b) average num- 
ber of words used to name each 
color, (c) latency of color naming, 
(d) amount of agreement between 
subjects in color naming, and (e) test- 
retest reliability of names given each 
color. All of these measures were pos- 
itively correlated with one another, 
forming a g factor called codability. 
The amount of agreement between 
subjects proved to be the best single 
measure of codability. The chips of 
highest codability were the ideal in- 
stances from within the eight color 
regions. 
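
The agreement measure can be illustrated with a small sketch. This is not
Brown and Lenneberg's scoring procedure; it assumes, purely for illustration,
that agreement on a chip is the share of subjects who give its most common
name. The function name and the sample responses are hypothetical.

```python
from collections import Counter


def naming_agreement(names):
    """Share of subjects who gave the most common name for one chip."""
    # Normalize case and surrounding whitespace before counting.
    counts = Counter(name.strip().lower() for name in names)
    return counts.most_common(1)[0][1] / len(names)


# Hypothetical responses from five subjects to two chips.
ideal_chip = ["red", "red", "Red", "red", "red"]
peripheral_chip = ["pink", "salmon", "light red", "pink", "coral"]

print(naming_agreement(ideal_chip))       # 1.0
print(naming_agreement(peripheral_chip))  # 0.4
```

On this simplified index a chip that every subject names the same way scores
1.0, and a chip that draws scattered names scores lower, which is the
direction of the finding reported for ideal versus peripheral chips.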

Brown (1958) says: 

It is not so clear how to plot a category like 
chair because we are not sure of the defining 
attributes of this category. . . It is proposed 
that central stimuli will be highly codable and 
peripheral stimuli less codable. These 
thoughts are offered as blueprints for a set of 
psychological laws relating category codabil- 
ity to category availability and stimuli coda- 
bility to category centrality (p. 241). 

From this theoretical framework 
and from the above results, we would 
expect that the number of verbal 
symbols available to describe attri- 
butes of a stimulus or the position of 
a stimulus along the dimension of
experience denoted by a concept
would be positively correlated with
the codability and discriminability of
stimuli along this dimension. The
amount of agreement among sub-
jects regarding symbols available in
any area of conceptualization should
indicate the amount of interpersonal
agreement as to the relevant compo-
nents of the concept. For example,
given the concept automobile, we
could produce many terms describing 
this concept, ranging from words like 
“Ford” and “4-door” through “MG”
and “saloon model” and down to in-
frequent responses like “Essex” and
“phaeton.” Within the Whorfian
framework, the more relevant words 
we could find, the more adequate the 
understanding of the concept. The 
more agreement between us in terms, 
the more likely we will be talking of 
the same thing when we talk of auto- 
mobiles in a generic sense. The ex- 
clusiveness of the words used would 
also be indicative of the adequacy of 
concept content. If to “automobile” 
we responded with "truck," “trac- 
tor,” "weapons carrier," and terms of 
this sort, we would be overinclusive 
and incorrect in our concept. 

It is possible to measure abstract 
and concrete concepts with regard to 
their codability and the adequacy of 
concept content. I chose seven con- 
cepts to examine. Three of them, life, 
cause, and time, are concepts that 
should, within Goldstein's definition, 
require abstract conceptual thought 
for adequate development. 

Twenty-seven subjects, all college 
students, were presented with the 
following instructions: 

In our lives we must learn many concepts, 
some of which have to do with the grouping 
or categorization of concrete objects, others 
with such abstract qualities as justice. A 
number of concepts will be presented to you 
on the next two pages. Following each con- 
cept are two words associated with it, indi- 
cating how we might describe an object or 
event in terms of that concept. Your task is 
to write other terms for each of the concepts.
You have three minutes for each concept. I 
will let you know when one minute is up, two 
minutes are up, and when the three minutes 
have been concluded, and will tell you when 
to begin and when to finish for each concept. 


The subjects were then presented 
with the seven concepts arranged on 
two sheets of paper, in random order 
with the order varying between sub- 
jects. Each concept was followed by 
two words that might be used to 
describe an object or event in terms 
of the concept; e.g., shape—triangu- 
lar, blunt; life—animate, alive. 

The responses of the 27 subjects 
were placed on alphabetized tally 
sheets. The total number of different 
words used to describe attributes of 
stimuli or the position of stimuli 
along the dimensions of experience 
denoted by the various concepts was
obtained. It had been hypothesized 
that the amount of agreement among 
subjects regarding symbols available 
in any area of conceptualization 
would indicate the amount of agree- 
ment between subjects in concept 
content. It was assumed, in this study,
that the number of symbols attached 
to a conceptual dimension by more 
than 10% of the subjects was a crude 
but useful measure of degree of agree- 
ment between subjects in concept 
content. The results are presented in 
Table 1. 
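
The tallying step can be sketched as follows. This is a minimal illustration
under stated assumptions, not the procedure actually used: the function name
and the sample responses are hypothetical, and words are matched after
lowercasing. With 27 subjects, a cutoff of three subjects (3/27, about 11%)
corresponds to the "more than 10%" criterion.

```python
from collections import Counter


def tally_concept(responses, min_subjects=3):
    """Summarize one concept's responses.

    responses: one list of words per subject.
    Returns (number of different words,
             words used by at least min_subjects subjects,
             percentage of different words that are shared).
    """
    subjects_per_word = Counter()
    for words in responses:
        # Count each word once per subject, ignoring case and repeats.
        subjects_per_word.update({w.strip().lower() for w in words})
    total = len(subjects_per_word)
    shared = sorted(w for w, n in subjects_per_word.items() if n >= min_subjects)
    percent = round(100 * len(shared) / total) if total else 0
    return total, shared, percent


# Hypothetical responses from four subjects for the concept "shape".
shape = [
    ["round", "square", "oval"],
    ["round", "square", "big"],
    ["round", "jagged"],
    ["square", "Round", "lumpy"],
]
print(tally_concept(shape))  # (6, ['round', 'square'], 33)
```

The three returned values correspond to the three kinds of rows in the table:
the count of different words, the words shared by three or more subjects, and
the shared words as a percentage of all different words.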


TABLE 1
SYMBOLS ATTACHED TO VARIOUS CONCEPTS

            Color  Shape  Position  Quantity  Time  Cause  Life

Number of Different Descriptive Words
             186     92      247       153     177    192   173

Number of Descriptive Words Used by Three or More Subjects
             [values missing]

Percentage of Descriptive Words Used by Three or More Subjects
              32     41       27        29      21      5    10

Words Used by Three or More Subjects

Color: Aqua, Amber, Auburn, Blue, Brown, Beige, Bright, Blue-green,
Baby blue, Chartreuse, Chinese red, Charcoal grey, Coral, Cinnamon,
Chocolate, Cream, Dull, Dark, Flesh, Green, Gold, Grey, Hazel, Indigo,
Ivory, Lime, Lemon, Light, Lavender, Lilac, Magenta, Mauve, Maize,
Maroon, Navy, Orange, Olive, Pink, Purple, Platinum, Red, Royal blue,
Rose, Red-orange, Rust, Sky-blue, Silver, Scarlet, Sea-green, Straw,
Steel blue, Turquoise, Tan, Toast, Violet, White, Warm, Wine, Yellow,
Yellow-green

Shape: Acute, Big, Circle, Cube, Circular, Diamond, Flat, Fat, Heart,
Hexagon, Irregular, Jagged, Long, Large, Narrow, Oblong, Oval, Octagon,
Parallel, Pointed, Pear, Pyramid, Pentagon, Polygon, Round, Rectangular,
Rough, Square, Short, Small, Sharp, Smooth, Sphere, Star, Thin,
Trapezoid, Wide

Position: Among, Around, Away, About, Amid, After, Above, At, Across,
Below, Beside, Beneath, Beyond, Back of, Betwixt, Before, Behind, By,
Bottom, Close, Central, Down, Diagonal, East, First, Far, Front of,
High, Horizontal, Higher, Here, In, Inside, Into, Juxtaposed, Last,
Low, Left, Lower, Middle, Next to, Near, North, On, Over, On top of,
Out, Outside, Second, Straight, South, Slanted, Separated, Side,
Through, There, Top, Third, Together, Touching, Under, Underneath

Quantity: All, A lot, Amount, Any, Bunch, Bushel, Couple, Either, Each,
Every, Few, Foot, Gallon, Great, Gross, Group, Hundred, Inch, Little,
Large, Lots, Less, More, Most, Much, Million, None, Numerous, Number,
Neither, One, Ounce, Pound, Peck, Pint, Quart, Some, Several, Small,
Single, Two, Three, Tens, Thousands, Yard

Time: Annual, Autumn, Biennial, Century, Day, Decade, Eternity, Evening,
Forever, Future, Hour, Infinity, Long, Light year, Later, Month,
Morning, Millennium, Moon, Never, Night, Now, Past, Present, Second,
Sometime, Short, Semester, Spring, Summer, Today, Tomorrow, Week, When,
Winter, Year, Yesterday

Cause: Beginning, Build, Create, Commence, Do, Push, Reason, Start,
Stimulate

Life: Active, Breathing, Conscious, Creative, Energy using, Existing,
Feeling, Generate, Growing, Has emotions, Learning, Lively, Living,
Moving, Not dead, Purpose, Thinking


While the total number of words 
used by this sample of subjects to 
describe the various concepts did not 
vary in a predictable fashion between 
concepts, the proportion of common, 
nonidiosyncratic responses did vary 
in the predicted direction. Agree- 
ment between subjects was greatest 
for the two concepts, color and shape, 
that describe concrete attributes of 
objects themselves. Agreement was 
somewhat less for subjects’ responses
describing position and quantity, 
concrete attributes that require view- 
ing the object with reference to other 
objects. Agreement in concept con- 
tent was still less for the three ab- 
stract concepts. Time, the only 
abstract concept for which adequate 
concept content is attained rather 
early in this culture, showed more 
intersubject agreement than did the 
responses to the other two abstract 
concepts. 

We find some confusion in concept 
content even in concrete concepts, 
such as shape, where common re-
sponses include the words “big” and
“small,” which are actually descrip-
tions of size. It is in the area of life,
however, where inaccurate concept
content is most noticeable. Attri-
butes of life include such words as
growing and moving, two terms that 
are also used to describe nonlife. Of 
all responses made, “moving” was 
made by the most subjects, parallel- 
ing the obtained responses of young 
children. 

Difficult concepts, in terms of 
reaching adequate concept content, 
are clearly concepts for which there 
are few nonidiosyncratic descriptive 
terms available, and/or for which 
common descriptive terms are not 
accurate in terms of concept content. 
While this is what would be predicted 
from a Whorfian framework, it is not 
proof that a Whorfian explanation is 
correct, since in this whole area it is 
difficult to determine what is cause 
and what is effect. Does a paucity of 
words cause us to disregard a dimen- 
sion of experience or does our disre- 
gard of a dimension of experience 
result in our having a very limited 
number of words to describe it? It 
should be possible, however, to test 
the linguistically relativistic position 
taken in this paper by comparing the
language structure of cultures where
a given concept is developed ade-
quately at an early age to cultures
where the concept is developed ade- 
quately at a later age. It should also 
be possible to test this explanation of 
concept formation by using small 
children as subjects and building into 
their vocabularies a number of non- 
sense words describing attributes or 
objects or events not usually distin- 
guished from one another in the 
English language. 

The testability of the linguistically 
deterministic explanation is probably 
greater than the testability of the 
genetic or developmental explana- 
tion of concept formation, since the 
developmental theorists must rest, 
ultimately, on saltatory changes in 
the brain connections as an explana-
tion for the changes in conceptualiza-
tion that they describe.

To summarize, we have examined 
two explanations of empirical data 
showing that concrete-perceptual con- 
cepts are learned to an adult level of 
understanding earlier than are ab- 
stract concepts. Data do not fully 
support a developmental explanation. 
A linguistic relativistic position is 
supported, but not proved to be cor- 
rect. If the linguistic relativistic 
position is correct, then linguistic 
structure is a motivating force in the 
thought process, since it causes us to
attend to some aspects of the environ-
ment more closely and more accu-
rately than to other possible dimen-
sions of experience.

IMPULSIVE VERSUS REALISTIC THINKING:
AN EXAMINATION OF THE DISTINCTION BETWEEN
PRIMARY AND SECONDARY PROCESSES IN THOUGHT

ERNEST R. HILGARD
Stanford University


Because we so commonly char- 
acterize Freudian psychoanalysis as a 
dynamic psychology, or a develop- 
mental psychology, or a psychology 
that emphasizes conflict or the relief 
of symptoms, we tend to translate the 
conceptions of psychoanalysis into an 
active mode. This tendency some- 
times causes us to overlook the fact 
that psychoanalysis is very largely a 
cognitive psychology concerned pri- 
marily with mental representations, 
with hallucinations and dreams, with 
memories, their distortions and re- 
pression, with attention and inatten- 
tion. Of course one might say that 
all psychology was mentalistic when 
Freud was writing, and that he was 
really talking about overt behavior 
and not about symbolic behavior. 
This I believe to be incorrect: Freud 
was very much concerned about 
symbols; his mental representations, 
condensations, displacements, and 
the rest are essentially cognitive. It is
appropriate for us to consider Freud's 
views in this symposium, for his 
psychology was at once a cognitive 
psychology and a psychology of 
motivation. 


Paper prepared for the Symposium on
Motivation in Thinking, Western Psycholog-
ical Association, Seattle, Washington, June
15, 1961; the paper has been somewhat re-
vised since it was delivered. It constitutes a
report from the Laboratory of Human Devel-
opment, established under a grant from the
Ford Foundation. The program on
hypnosis to which reference is made has been
carried on with the additional support of the
Robert C. Wheeler Foundation and the Na-
tional Institute of Mental Health (Grant
9).


The basic cleavage in thought, ac- 
cording to Freud, is between two 
processes, the earlier and more primi- 
tive primary process, and the later 
more orderly, rational, and reality-
oriented secondary process. I wish to
examine this distinction to see of what 
service it might be within general 
psychology. 

The distinction between the illogi- 
cal and impulsive in thought, on the 
one hand, and the logical and ra- 
tional, on the other, is of course a very 
old one, and is not original with 
Freud. Every elementary logic 
course points out the circumstances 
that lead to fallacious thinking, and 
these include the argumentum ad 
hominem, and other kinds of argu- 
ment that permit prejudice to blind 
judgment. The notion that “the wish 
is father to the thought” did not 
begin with Freud. Contemporary 
writers, too, such as Piaget and 
Werner, arrive at distinctions be-
tween earlier and later modes of
thought. Hence some such distinc- 
tion as that which Freud makes be- 
tween primary and secondary process 
is plausible enough. 

The question for us to face is not 
whether this distinction is plausible, 
but whether there are novel features 
in Freud's conception that are impor- 
tant, whether the concepts are clear, 
and whether there are suggestions for 
empirical work deriving from them. 


TWO PROCESSES ACCORDING TO FREUD

The distinction between primary 
and secondary processes is so perva-
sive in psychoanalysis that it often
receives scant mention by psycho- 
analytic writers who fully accept it. 
This may be in part because the terms 
belong to the metapsychology, and 
the clinical literature of psychoanal- 
ysis is commonly not expressed in 
these terms. The more theoretical 
discussions of psychoanalytic theory 
invariably find a central place for 
these processes. Freud's biographer 
says: "It was this distinction on which 
rests Freud's chief claim to fame: 
even his discovery of the unconscious 
is subordinate to it" (Jones, 1955, p. 
313), and the translator of his Inter- 
pretation of Dreams says in a footnote: 
“The distinction between primary 
and secondary systems, and the 
hypothesis that psychical functioning 
operates differently in them, are 
among the most fundamental of 
Freud's concepts" (Freud, 1953, p. 
601). It is of some interest, therefore, 
to review the attention that Freud 
gave these terms, and then to try to 
assess their meaning for a general 
psychology of cognition. 

Freud introduced the terms in his 
Project for a Scientific Psychology, 
prepared in 1895, but not published 
until after his death along with his 
letters to Fliess. The first mention 
was in a letter to Fliess dated Oc- 
tober 20, 1895, with reference clearly 
to the Project upon which he was then 
working. The relevant section in the 
Project is entitled “Primary Proc- 
esses: Sleep and Dreams” (Freud, 
1954, pp. 397-404). Here most of the 
later ideas are anticipated, although 
at this stage they are couched as a 
speculative neuronal theory—a the- 
ory that at least one competent re- 
viewer finds to be in many ways an 
anticipation of contemporary de- 
velopments in neurophysiology 
(Pribram, 1962). 

The next full-scale discussion is in
the Interpretation of Dreams, with the 
relevant section entitled “The Pri- 
mary and Secondary Processes: Re- 
pression” (Freud, 1953, pp. 588-611). 
Freud returned briefly to the problem 
from time to time thereafter, the most 
important single paper being “Formu-
lations on the Two Principles of
Mental Functioning” (Freud, 1958).
Later papers helped to coordinate 
the two processes with later develop- 
ments in the theory, such as the new 
"death instinct" (Freud, 1955) and 
the new id, ego, superego structures 
(Freud, 1961). 

The most painstaking effort to 
understand what Freud meant and to 
cast what he said into the form of 
conceptual models was made by 
Rapaport in a series of papers 
(Rapaport, 1950, 1951a, 1951b, 1957, 
1959, 1960; Rapaport & Gill, 1959), 
all of which bear in one way or an- 
other on the problems of motivation 
in thinking. The main conclusion to 
which Rapaport came is that there 
are two kinds of organization of
memory which become reflected in
the two kinds of thinking: drive-
organization² and conceptual-organi-
zation, the former representing, of
course, primary process, the latter
secondary process. 

2 It is not possible in a brief paper to deal
with all the puzzling problems that are raised
in trying to be at once appreciative and crit-
ical of psychoanalysis. In accepting the drive
concept from psychoanalysis, and equat-
ing it with what most psychologists mean by
drive, we overlook a rigidity within writers on
classical psychoanalysis who recognize only
two drives (sex and aggression), despite the
primitive nature of pain, hunger, thirst, tem-
perature, contact, curiosity, manipulation,
and the other candidates for inclusion as
drives. Freud did not settle the matter once
and for all in 1920 when he proposed the death
instinct, which for his followers made aggres-
sion a second drive along with sex.

In carrying through the conceptual
distinction between primary and
secondary process, Rapaport deduced 
from Freud primary models of action, 
cognition, and affect (indicating their 
characteristics when primary process 
is in control) and secondary models of 
each of these, when the delays of 
secondary process are introduced 
(Rapaport, 1959, pp. 71-78). The 
primary model of action is that 
familiar in the drive-reduction theory 
of motivation: aroused drive-tension, 
presence of the incentive and response 
to it (in psychoanalysis, sucking the 
mother's breast), followed by drive 
reduction. The primary model of 
cognition is aroused drive tension in 
the absence of the incentive, leading 
to hallucination of the incentive. 
Finally, the primary model of affect 
substitutes affect discharge for hallu- 
cination. Thus the hungry infant 
may scream instead of hallucinating 
the breast. All primary models indi- 
cate prompt response to the drive 
that reaches threshold intensity; all 
secondary models introduce delays. 
The secondary model of action intro- 
duces a derivative drive (similar again 
to learning theories that study the 
drive value of familiar paths and the 
secondary reinforcement value of sub- 
goals). The role of inhibition (in the 
absence of the goal-object) is also 
stressed; again something familiar in 
the learning-theorist discussion of 
frustration-induced drives (Amsel & 
Roussel, 1952; Marx, 1956). The 
secondary model of cognition substi- 
tutes for the hallucination of the 
object a search for it, i.e., ordered
thinking. The secondary model of 
affect substitutes for massive affect 
discharge a lesser anticipatory dis- 
charge that serves instead as a signal; 
behavior may be released which de- 
fends against the more massive affect 
discharge. There are complexities
within each of these models that this
brief summary cannot deal with.

Some of the characteristics of the 
two processes which we need to ex- 
amine in relation to a general theory 
of thinking are the following: 

1. Primary process is earlier in 
time and more primitive than sec-
ondary process. This does not mean
that it is ever outgrown, however, for 
primary process functioning is char- 
acteristic of the normal adult as well 
as the infant, e.g., in dreams. 

2. When the primary process holds
sway, wishing ends in hallucinating;
the infant is said to hallucinate the 
satisfaction of its internal needs when 
they cannot be gratified at once. 
Massive affective discharge is an 
alternative. 

3. Primary process is coordinated 
with the pleasure principle, secondary 
process with the reality principle. 

4. The pleasure principle "reigns 
unrestrictedly in the id” and the ego 
endeavors to substitute the reality 
principle. 

5. The formal characteristics of
primary and secondary processes
differ, the characteristics of primary
process being inferred largely from
dreams. Thus the disregard for space
and time and for ordinary logic is
typical of primary process; the proc-
esses that Freud called the dream-
work are primary ones, especially
condensation, displacement, and sym-
bolization.

6. Primary process involves “mo-
bile cathexis” and the manipulation
of large quantities of energy; second-
ary process involves “bound cathexis”
and operates with small amounts of
energy. The interaction between pri-
mary and secondary processes is
conflictual, involving repression, de-
fense, and the like.

7. Primary process is compelling,
peremptory; secondary thought ac-
tivity (practical thought, rational
thought) we can “take or leave”
(Rapaport, 1959, p. 76).

8. Primary process thinking in 
conscious subjects may be found 
"either out of strength or out of weak- 
ness" (Holt & Havel, 1960, p. 267). 
That is, primary process thinking 
may emerge out of ego weakness (as 
in a psychotic state) or because a 
person regresses to primary process 
thinking for fun or in order to open 
himself to creative ideas. This has 
come to be called "regression in the 
service of the ego" (Kris, 1952; 
Schafer, 1958). 

Here then is a rich store of ideas. 
For these ideas to become a part of 
general psychology we need, first, to 
understand the theory in its own 
terms, second, to criticize it, and 
eventually, to reconstruct it. The 
ultimate contribution of Freud does 
not rest on a decision whether he was 
right or wrong; eventually we want 
to know more than he knew, but if he 
helped to stimulate the search that 
will be tribute enough to him.


FREUDIAN CONCEPTIONS EXAMINED 


Let us pass quickly over some of 
the general ideas that in one form or 
another everyone finds acceptable. 
Some kind of genetic-developmental 
theory of thinking is acceptable, and 
a number of these have of course been 
proposed, such as those of Piaget 
(1955) and Werner (1948). The de- 
tails are a matter of some uncer- 
tainty, but there is probably some 
kind of continuous development 
rather than a saltatory or discon- 
tinuous transition from one stage to 
the next. The Freudian theory can be 
conceived in this continuous way, for 
primary process is never completely 
displaced by secondary process (Bur- 
stein, 1959). Freudian conceptions 
have been compared with those of 
Piaget by Wolff (1960).³ Also some
kind of contrast between prelogical, 
concrete, impulse-driven thinking and 
more abstract, dispassionate, realistic 
thinking (both forms found in the 
adult), is acceptable. It is important 
here, however, to know just what we 
are talking about, and Freud rests his 
case on the dream as the prototype of 
primary process; this can be objected 
to either on the grounds that dream 
thinking is not a good representative 
of illogical and fallacious thinking 
(even though it manifests these char- 
acteristics), or that Freud gave a one- 
sided picture of what dream thinking 
was like. French, who believes that 
dreams are attempts at problem 
solving and are more orderly than 
Freud thought, has reanalyzed 
Freud’s Dora case in these terms 
(French, 1954, pp. 10-18). 

3 While the tenor of Wolff's monograph is
that the coordination of Piaget and Freud
should be rather easy, he has given some pene-
trating analyses of their differences, particu-
larly in the first stage of development, where
the primary-secondary process distinction is
most cogent. Here he points out that accord-
ing to Piaget the organism's fundamental ten-
dency is to assimilate the environment to
itself, while Freud's theory is that it tries to
rid itself of all stimulation (Wolff, 1960, p.
60). The development of ego-psychology
within psychoanalysis now makes it easier for
the classical analyst to accept early interac-
tion with the environment, while not giving
up any of his long-held views about intra-
psychic processes.

The most controversial features of
the Freudian scheme, either because
they are unclear, unproven, or dis-
puted, seem to me to be: (a) the
theory of the interplay between the
pleasure principle and the reality
principle, especially in the negative
definition of pleasure as tension-
reduction, and the separation between
affect and cognition, as implied in the
notion that affect discharge is an
alternative to hallucination as a
means of primary tension reduction;
(b) the conjecture that the infant
hallucinates the absent incentive; and
(c) the energetics involved in the
contrast between primary and sec-
ondary processes. Each of these de-
serves some comment.

The tension-reduction theory of
motivation has come in for a number 
of attacks, and attention has grad- 
ually shifted from the negatives of 
tension relief to the positive role of 
incentives (e.g., Hilgard, 1956, pp. 
427-433; White, 1959). Freud's 
pleasure principle, while somewhat 
more complex than the typical moti- 
vational theory of the experimental 
students of learning, subjects Freud 
to the same kind of criticism, for ex- 
ample, for his neglect of joy and hope 
among the affects (e.g., Schachtel, 
1959, pp. 19-21). This issue is being 
fought out within general psychol- 
ogy, and it would not take too much 
doctoring to fit the Freudian theory 
to whatever the outcome is. 

A most original feature of Freud's 
theory is that the infant hallucinates 
the absent object. This is of course 
conjecture, based upon the predomi- 
nantly visual nature of dreams, but 
the conjecture occurs repeatedly in 
Freud's writings. It should be noted 
that this cannot be mental activity 
at its earliest, for the hallucination is 
a memory, and some theory of prior 
perception and recovery is implied. 
The pleasure principle may be con- 
ceived to operate before modes of 
thought have developed at all. At 
one point Freud used the illustration 
of the bird inside the egg, with the 
nutrients there to be had immedi- 
ately; the wish for nutrients cannot 
be distinguished from the avail- 
ability of nutrients (Freud, 1958). 
Through some further steps, made 
necessary in human development be- 
cause the object of gratification is not 
always there, attention and memory
develop, and, out of them, thought. 
This is the course of development in 
the direction of the reality principle, 
but one thought-activity is split off: 
that is fantasy making. This is then 
the primary process that persists in 
thinking after secondary process 
thinking has also developed. The 
fact that the fantasy does not ac- 
tually bring relief means that sec- 
ondary process thinking must de- 
velop almost simultaneously; we are 
probably dealing with a ratio of the 
two processes from the start, more 
primary process gradually giving way 
to more secondary process. 

How plausible is the conjecture 
that the infant hallucinates in re- 
sponse to its needs? Evidence would 
be hard to get, although working 
backwards by analogy from EEGs 
and eye movements in hallucinating 
adults we might be able to get some 
evidence; to my knowledge this does 
not exist. The truth is probably a 
metaphorical one, emphasizing the 
tendency of thought to move to the 
concrete, the specific, the pictorial, 
and attributing to the infant what is 
found in adult dreams and in the 
hallucinations of deprived adults (the
mirage on the desert) and psychotics. 
The tendency for more primitive 
thought to take concrete forms is not 
without support in experimental 
studies, for example, the concrete- 
abstract distinction of Goldstein and 
Scheerer (1941), and the greater 
ease of attaining concrete over ab- 
stract concepts in general, for ex- 
ample, Heidbreder, Bensley, and Ivy 
(1948), and Grant (1951). Freud ap- 
parently was not completely satisfied 
with his treatment of hallucinations; 
at one point he suggested that the 
negative hallucination (i.e., denying 
the presence of stimulation) might be 
a better point of departure than the
positive hallucination from which to
start an explanation (Freud, 1957). 
The energy concepts within Freud- 
ian psychology are difficult ones at 
best and pose a number of problems 
(Colby, 1955; Hilgard, 1962). The 
term cathexis in Freudian theory is 
used for some kind of energy charge, 
but the analogy with physical energy 
is not a close one; the meaning is 
much more that of interest, or atten- 
tion, or of Lewin's valence. In any 
case a highly cathected idea comes to 
awareness (i.e., can be attended to) in 
competition with less cathected ones 
and can be driven out of awareness 
by countercathexes. The notion of 
mobile cathexis, used in discussing 
primary process, is that, as Holt and 
Havel (1960) put it, “an idea and
its cathexis are easily parted”: the
search pattern or drive that can 
cause one idea to be cathected may 
as well cathect another one. Hence 
one idea easily substitutes for another 
in a dream. An idea and its cathexis 
are more closely bound when sec- 
ondary process operates: when one 
idea is searched for in memory, or 
somehow comes across the threshold 
because of the state of its cathexis in 
relation to competing ideas, it comes 
in stable, reliable form. Poetry tends 
to deal in more mobile cathexes than 
science does (“to take up arms
against a sea of troubles” versus “Sea
water contains sodium chloride”).
Dealing with the distinction between 
primary and secondary process in 
terms of cathexes is metaphorical, 
but it communicates something that 
is comprehensible; still one is never 
sure but what he is missing some-
thing.⁴ In addition to mobile and


4 Obscure ideas sometimes seem less ob-
scure to those who use them simply because
they become familiar. Cathexis is, in fact, a 
very obscure idea; as in the case of other ob- 
scure ideas it becomes a difficult problem to 
determine when such an idea is merely ob-


bound energy there is neutralized
energy (Freud, 1961; Kris, 1950), re-
ferred to as delibidinized, deaggres-
sivized, or sublimated. These forms
are all said to have their roots in the 
innate drives (sex and aggression) 
but have been transformed from pri- 
mary process so as to be at the serv- 
ice of secondary process; there may 
also be forms of neutralized energy 
that do not come from drives (Hart- 
mann, 1950). Once neutralized 
energy is accepted the dichotomy be- 
tween primary and secondary proc- 
esses becomes less sharp (Rapaport, 
1959, p. 92). 

Another problem of energy in pri- 
mary and secondary processes has to 
do with amounts, large amounts being 
involved in primary process, small 
amounts in secondary process. This 
is a little confusing because in phys- 
ical outcome primary process tends 
to go on while the person is immobi- 
lized in sleep and incapable of putting 
out much energy; secondary process 
permits the physical outcome of ener- 
getic control over the environment. 
It is necessary to be repeatedly re- 
minded that we are talking about 
amounts of psychic energy and not 
physical energy. Actually the mat- 
ter has not been stated quite properly 
here: in primary process the quan- 
tity of energy dealt with is large be- 
cause it is mobile and all discharged 
at once. This is what gives primary 
process manifestations their insistent 
quality; they, so to speak, “take
over.” The total quantity of energy
dealt with in secondary processes 


scure and when it is also profound. Attempts 
to clarify the concept have thus far not proved
very helpful (e.g., Rapaport, 1959, pp. 12 ;
129). There is no doubt that the notion of
cathexis attempts to deal with deep psycho-
logical problems, e.g., how the registration of
a past experience stored in the nervous sys-
tem becomes available to consciousness, how
symbolization occurs. The question is how
well it solves these problems.


may be the same, but its regulation is
through small quantities of energy, 
just as a small thermostat may con- 
trol a large heating plant. Hence 
secondary process is more finely 
tuned and can be turned on and off 
as primary process (usually) cannot 
be. 

In order to take these ideas out of 
their metaphorical context and place 
them nearer to general psychology, 
we can look for some resemblances to 
familiar ideas: 

1. Free association is more like 
primary process than controlled as- 
sociation because in controlled as- 
sociation we insist on bound cathexes, 
that is, on "appropriate" replies, as 
when we ask for a part-whole rela- 
tionship, or a large-small relation- 
ship, and then give one member of a 
pair and ask for an associate. In free 
association, anything will do, so long 
as an answer is given. Under these 
circumstances (and this is where 
Freud comes in) unconscious factors 
are likely to provide the missing in- 
termediaries between stimulus and 
response. 

2. Some persistent ideas (as in ob- 
sessions) have about them a driven 
quality, as though we are helpless 
about them; they seem to happen 
from without, as though they happen 
to us rather than by us. Thus we do 
not feel ourselves to be the stage 
managers of our dreams. This is 
what is meant by the immediate and 
powerful discharge of primary proc- 
esses. 

3. We sometimes distinguish the 
affective consequences of punishment 
from the informative consequences. 
Too much affect may produce what 
Thorndike called irrelevant emotion; 
according to the Yerkes-Dodson law 
too much punishment interferes with 
learning. Thus the massive involve- 
ment of affect is inhibiting to realistic 
cognitive activity; if the affect comes
in smaller doses, then the organism
can profit by it in learning its way 
around. Here we have a clue to 
Freud's notion that secondary proc- 
ess experiments with small amounts 
of energy. The notion is also related 
to modern information theory, which 
distinguishes between the control 
mechanisms and the power 

tions that are controlled. Rapaport
has noted this possible parallel (Rap-
aport, 1959, p. 91).

4. The opposition between pri-
mary and secondary processes is
tempered somewhat in the notion of 
regression in the service of the ego 
earlier referred to; it is a regression 
from which one can escape, so that it 
does not have the full peremptory 
quality usually assigned to primary 
process. That is, we can go to a “kid
party” and then change our clothes 
and become adult; we are not com- 
mitted to hebephrenia by this act of 
temporary regression. The original
discussion of regression in the service 
of the ego (Kris, 1952) is a very 
sketchy one; the best elaboration is 
by Schafer (1958). There is a curious 
quality about Schafer’s account, 
however. He gives conditions
facilitating regression in the service
of the ego; these are all conditions of
good mental health or ego strength,
and as he reviews them himself he
sees that they are not quite appropri-
ate to gifted artists, comics, and sci-
entists (who are supposed to use
regression in the service of the ego
unusually well). He resolves this 
problem by indicating that such 
regressions may serve different indi- 
vidual purposes. The trouble is prob- 
ably not with his account but with 
the concept itself. Probably more is 
involved than that a [...] per-
mits primary process thought to
appear. One might think of several 
possibilities, such as (a) a capacity 
for regressive experiences, for ex-
ample, richness of imagination; (b) a
tolerance for regressive experiences, 
for example, lack of anxiety when 
thought and imagination are given 
free range; and (c) skill in the utiliza- 
tion of regressive experiences, for ex- 
ample, ability to convert fantasy into 
acceptable artistic or other creative 
products, including humor.⁵ These,
or other aspects, may mean that the 
experience called regression in the 
service of the ego has several dimen- 
sions. Schachtel (1959, pp. 244-248) 
objects to the notion that the experi- 
ences are regressive at all; a certain 
openness to new ideas need not be 
regressive, but is better interpreted, 
he believes, as progressive. 

When all the trappings of the 
theory of primary and secondary 
processes are removed there remains 
much in the major distinction that is 
plausible and familiar: enough to in- 
vite an examination of the more ob- 
scure conceptions. 


SOME QUESTIONS SUBJECT 
TO ANSWER 


Let us now grant that as reference- 
concepts the primary and secondary 
processes are useful, and see how we 
can go on from there, outside the 
special framework of the Freudian 
metapsychology. The basic classifi- 
cation, following David Rapaport, is 
between drive organized and concept
organized memories as they enter in-
to our thought processes. If pri-
mary process rules our thinking, the
vehicles of thought, the ideas to 
which we can attend, are brought to 
awareness by the impulses or drives 
that are stirred up; thus our mem- 
ories are drive organized. If my reac- 
tions to my boss are dictated by an 
unperceived relation between him 


5 Some of these distinctions have been made 
by As, O'Hara, and Munger (1962) in at- 
tempting to discover regression-like experi- 
ences related to hypnotic susceptibility.


and my father, then my thoughts of
the boss are drive organized. If sec- 
ondary process rules my thinking, 
then I may use what I have learned 
from interacting with my father, but 
I know my boss is not my father, and 
I react to him in accordance with the 
demands of the actual social situa- 
tion. In this case, my thought is con- 
cept organized, according to the lines 
of command within the organization 
in which I work, the assignment I am 
working on, and so on. We have long 
been taught to distinguish between 
reasoning and rationalization; the 
former representing thought under 
the conceptual mode, the latter 
thought that is impulse driven. 

If we grant the distinction be- 
tween primary and secondary proc- 
ess, or drive organized and concept 
organized thought, then we have to 
decide how we are to use this distinc- 
tion in talking about the wide range 
of things people do when they think. 
There are two chief ways of using a 
twofold scheme of this kind, one as a
dimension, the other as a mixture. 

The dimensional scheme takes off 
from the notion of growth, and as- 
sumes that primary process is primi- 
tive and early, secondary process 
more mature and later. One can then 
draw a line with primary process at 
one end and secondary process at the 
other, and place any act of thought 
along this line. The thoughts that 
are represented in the middle are 
fusions, if you wish, with some 
aspects of primary process and some 
aspects of secondary. I suppose one 
could go to a modern art exhibit and 
place the pictures along such a con- 
tinuum, with the totally nonrepre- 
sentative pictures at one end, corre- 
sponding to impulse, with photo- 
graphic representations of reality at 
the other; those in between would be 
the kinds of distorted or stylized pic-
tures that combine impulse with
reality. This scale would be a kind of
analog of a scale from primary to sec- 
ondary process. The dimensional 
position is the one favored by 
Rapaport (1951a, 1951b), Hartmann 
(1950), Kris (1952), and Holt and Havel
(1960). 

The mixture scheme suggests that 
primary process and secondary proc- 
ess remain to some extent distinct, 
but one intrudes upon the other; their 
conflicts are compromised in various 
ways, but there is characteristically 
enough vacillation between them to 
keep their identities intact. As one 
grows older a larger part of his 
thought tends to be of the secondary 
process kind, but he reverts to pri- 
mary process thinking in dreams and 
fantasy. 

These two ways of schematizing 
the relationship between primary and 
secondary process can only be dis- 
tinguished if the conceptual models 
are clear, for it is often hard to tell 
the difference between a fusion (im- 
plied in the dimensional scheme) and 
a mixture (implied when the two 
processes fight it out, but each con- 
tinues its own existence). 

These notions are too abstract to 
deal with unless we have some ex- 
amples before us. Let us consider 
some examples of thinking. 

1. A schizophrenic patient says to 
his physician: “I am 75 years old.” 
The physician says to him: “You feel 
that you have suffered three times as 
much as most 25-year-olds.” If the 
interpretation is correct, the patient 
has distorted reality, assigned him- 
self a false age, as an expression of 
affect. But in so doing he has multi-
plied 25 × 3 correctly. The primary
process interpretation is that the 
ideas that he manipulates come from 
his store of memories by way of im- 
pulse. He does not remember the 
age based on his birth certificate; he 
remembers the phenomenal time
through which he has suffered. The
fact that he can manipulate these 
ideas correctly does not deny their 
primary process origin. 

2. A hypnotized subject is told 
that he is about to hear a very funny 
joke. The hypnotist tells him: “The 
whale is the largest living mammal.” 
He laughs as though his sides would 
split. Aroused from the hypnotic 
state he is asked why this was so 
funny. One subject says: “It really 
wasn't funny. I just had a sort of 
laughing fit.” Another says: “You
should have seen the funny whale I
pictured with a long snout and tiny
legs. It sure was funny!” In the first
of these, impulse and cognition were 
not fused, in the second they were. 

3. A hypnotized subject is shown 
a small metal box with one real light 
on the left, but told that there are 
two lights, one on the left and one on 
the right. He sees both lights. Asked 
if they are both real, he says, “Yes.”
Told that one of them is not real, but
to find which is which, he says: “The
one on the right is not real; it casts
no reflection in the metal surface, as
the one on the left does.” If the
hallucination signifies primary proc- 
ess, the successful problem solving is 
secondary process. Here both go on 
simultaneously, but they remain dis- 
tinct; the hallucination is not de- 
stroyed by the knowledge that it is 
not real. 

4. A subject who volunteers to be 
hypnotized for the first time by a 
technique in which gradual eye- 
closure is suggested, raises his arms 
before his chest, moans, and sobs. 
Roused from hypnosis, he can give no 
account of any ideas associated with 
the display of affect. In a later 
interview outside hypnosis childhood 
memories were reviewed, and he 
demonstrated how he cowered in a
chair when he was beaten by his 
mother. He re-enacted in the inter-
view the positioning of his hands, his
tightly closed eyes, his moaning and 
sobbing. His behavior in the hyp- 
notic situation can be interpreted as 
the reactivation of a memory (non- 
verbal reactivation in this case) on 
the basis of some similarities between 
the hypnotic induction and the 
earlier submission to authority. This 
memory was drive organized rather 
than concept organized; it did not, 
however, involve hallucinations. 

5. A subject who has just under- 
gone a hypnotic session without very 
much success, when leaving the ex- 
periment suddenly experiences a 
spontaneous regression: she finds that 
her body is shrinking and she is 
becoming a small-sized girl again. 
Somewhat frightened by this dis-
torted body-image, she looks about
her to see that the world of objects 
has not changed, and she becomes 
her own size again. She is able to 
switch the experience on and off. For 
a while her regressed body-image 
coexisted with a real world; it is an 
important principle that in a re-
gressed state not everything is re-
gressed.


I have here given five illustrations 
to show what kinds of problems are 
to be faced in trying to assess pri- 
mary and secondary process think- 
ing, particularly in formulating them 
clearly enough to decide whether one 
should talk about fusions, or mix- 
tures, or both. 

Perhaps these illustrations them- 
selves suggest experimental prob- 
lems. I should like to suggest that 
more careful study of fantasy pro- 
ductions, eidetic images, and halluci- 
nations will make important contri-
butions, provided these studies are
guided by theory. Hypnotic experi-
mentation, from which most of my 
illustrations were drawn, provides a 
convenient way of getting into these 
areas, but other methods are avail- 
able. Robert Holt and his associ-
ates, for example, have been study-
ing primary process manifestations 
in Rorschach responses (Goldberger, 
1958; Goldberger & Holt, 1958; 
Holt, 1956; Holt & Havel, 1960).
Presumably there should be more 
secondary process in the TAT, and 
this might be a good place to examine 
the problem of fusion versus mixture. 

Let me say a word about eidetic 
images. These have been very little 
studied in recent years, yet they can 
be detected when they are looked for. 
We find a good deal of evidence of 
their presence among our more hyp- 
notizable subjects. The subject who 
was told stories by an Irish grand- 
mother who believed in (and had 
actually seen) Leprechauns, has little 
trouble in seeing Leprechauns her- 
self, as eidetic images. These are 
now memory images from child- 
hood, but they bring a kind of grati- 
fication that is close to the original 
meaning of primary process, even 
though the gratification is derivative 
from the grandmother. The subjects 
in our sample who have these images 
tend also to be highly verbal and 
communicative, by contrast with the 
nonhypnotizable subjects who lack 
both fantasy and easy verbal expres- 
sion. One might suppose words to be 
representative of secondary process, 
but they are heavily loaded with pri-
mary process too. Thus poetry, a
verbal art, uses many of the same de- 
vices as the dream. There are many 
problems here. 

A symposium is a good place to 
throw out problems for discussion, 
even though answers are hard to
come by. Let me summarize some of
the issues: 

1. Is it possible to sharpen the 
characterization of primary and sec- 
ondary process thinking so that the 
delineation will be clearer than it 
now is? For example, when is halluci-
nation an essential part of primary
process thinking?

2. In dealing with any illustration
of thinking that we wish to classify 
in primary-secondary process terms, 
do we do better to describe aspects of 
the thinking as primary and second- 
ary functioning, so as to place the il- 
lustration on a continuum, or do we 
describe the mixture and vacillation 
between the two processes? Or do 
we need a more complex model that 
encompasses both fusions and mix- 
tures? 

3. What kinds of experiments can 
we set up to help us sharpen these 
distinctions and bring them into line 
with our other ways of conceptualiz- 
ing thinking and problem solving? 
For example, Charles Fisher's (1960) 
perceptual experiments suggest the 
possibility that less clearly perceived 
(perhaps subliminal) material tends 
to be recovered in memory through 
drive organized memories, while more 
clearly perceived material tends to 
evoke concept organized memories. 
Here is certainly the kind of hy- 
pothesis that can be put to test, once 
our criteria of the two types of or- 
ganization are clearly formulated. 


SUMMARY 
1. Freudian psychology is in many 
respects a cognitive psychology, con- 
cerned as it is with hallucinations, 
dreams, memories, symbols, and dis-
tortions of the thought process. It is
at once a cognitive psychology and a 
psychology 


mary and secondary processes is a 
very central one within psychoana- 
lytic theory. The nature of these 
processes as described by Freud, and 
interpreted by Rapaport, is best sum- 
marized by asserting that there are 
drive organized and conceptually or- 
ganized memories that enter into the 
two kinds of thinking. 

3. Some of the problems of the 
Freudian theory are examined, and 
the plausibility of the theory is con- 
sidered in the light of other ap- 


raise questions about the two proc-
esses, whether a particular example


should be viewed as a fusion of the 


or as a mixture of them. 
The answer is not clear, and a com- 
plete model might have to include 
both fusions and mixtures, if the dis- 


ception and dreams.


VISUAL DEPTH DISCRIMINATION IN ANIMALS¹

Visual depth discrimination may
be defined as differential response to 
stimuli at different distances from 
the observer. Methods for eliciting 
such responses, and thereby assessing 
the discriminative capacities of vari- 
ous subhuman species, have been 
placed in five general classes. The 
terms "depth" and "distance" are 
used synonymously throughout, to 
refer to physical distance. 


SIZE CONSTANCY STUDIES


Retinal image size is an unreliable 
indicator of the depth of objects. For 
this reason, studies which attempt to 
demonstrate accurate depth discrimi- 
nation must show differential re- 
sponses to cues other than retinal 
image size. One way of demonstrat- 
ing the capacity of an animal to react 
discriminatively to depth cues other 
than retinal image size is by experi- 
mentally demonstrating "size con- 
stancy." Size constancy is typically 
obtained in the following way: first, 
the animal is trained to approach S^D
and not S^Δ, where S^D and S^Δ are two
stimuli identical except for size,
placed at an equal distance from the
animal. Then the behavior of the
animal is observed when the distances
of the two stimuli are changed so as to
invert the normal relation between
distance and retinal image size. If the
animal still approaches S^D, it is clear
that this response is under the control
not simply of retinal image size, but
of some combination of other cues to
depth. 

Herter (1930, 1953) trained a single 
carp (Carassius vulgaris) to swim to- 
ward the larger of two white circles 
(17 and 25 mm. in diameter) that 
were placed at a distance of 25 cm. 
from the subject. He then tested the 
animal in extinction under three con- 
ditions: (a) the stimuli were placed at 
25 and 50 cm., respectively, from the 
starting position, thus inverting the 
size-distance relation; (b) the left eye
of the fish was removed surgically,
and the stimuli remained at 25 and 50
cm.; and (c) the monocular fish was
tested with both stimuli at 25 cm.
from the starting position.


TABLE 1

NUMBER OF RESPONSES TO THE TWO
STIMULI UNDER THE THREE CONDITIONS

                                   Stimulus
Condition                        Large   Small
Binocular, unequal distances       9       1
Monocular, unequal distances       4      16
Monocular, equal distances         9       1


The number of responses made to the two
stimuli under the three conditions is 
shown in Table 1. The animal re- 
sponded correctly (swam toward the 
larger stimulus) on most of the trials, 
except when viewing stimuli at un- 
equal distances with monocular re- 
gard, where 80% of the responses 
were incorrect. Herter concluded 
that the cues for depth discrimina- 
tion in the carp are binocular, since 
the animal exhibited size constancy 
only when allowed binocular vision. 
Meesters (1940) replicated Herter’s
procedure with the three-spined
stickleback and concluded that binoc- 
ular vision is necessary for depth dis- 
crimination in this species as in the 
carp. Götz (1926) and Hertz (1928)
studied depth discrimination in the 
chick and the bluejay, respectively, 
by training the birds to approach the 
larger of two stimuli and then testing 
for size constancy by placing the 
stimuli at unequal distances so that 
the size of the retinal image was no 
longer an accurate cue to depth. 
Their animals, which were allowed
binocular vision, responded correctly 
in the size constancy test, showing 
that chicks and bluejays discriminate 
depth. However, it is not possible on 
the basis of these two studies to 
specify any of the necessary cues for 
depth discrimination. Köhler (1915) 
observed the size constancy effect in 
chimpanzees responding to rectangles 
of different sizes; however, depth 
discrimination failed under monocular 
regard. 
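
The logic of the size constancy test can be illustrated numerically. The following Python sketch, which is not part of any of the studies reviewed, computes the visual angle subtended by Herter's two circles during training and during the unequal-distance test, and then expresses the counts of Table 1 as percentages. It assumes that the 17-mm. circle remained at 25 cm. while the 25-mm. circle was moved to 50 cm.; the function and variable names are hypothetical.

```python
import math


def visual_angle_deg(size_mm: float, distance_mm: float) -> float:
    """Visual angle (in degrees) subtended by an object of a given size and distance."""
    return math.degrees(2 * math.atan(size_mm / (2 * distance_mm)))


# Training arrangement: both circles 25 cm. (250 mm.) from the subject.
small_train = visual_angle_deg(17, 250)
large_train = visual_angle_deg(25, 250)

# Test arrangement (assumed assignment): small circle at 25 cm., large at 50 cm.
small_test = visual_angle_deg(17, 250)
large_test = visual_angle_deg(25, 500)

# Counts from Table 1: (responses to large, responses to small).
table1 = {
    "Binocular, unequal distances": (9, 1),
    "Monocular, unequal distances": (4, 16),
    "Monocular, equal distances": (9, 1),
}


def percent_to_small(large: int, small: int) -> float:
    """Percentage of responses directed to the smaller (incorrect) stimulus."""
    return 100 * small / (large + small)


percent_incorrect = {
    condition: percent_to_small(*counts) for condition, counts in table1.items()
}

print(f"training: small {small_train:.2f} deg, large {large_train:.2f} deg")
print(f"test:     small {small_test:.2f} deg, large {large_test:.2f} deg")
for condition, pct in percent_incorrect.items():
    print(f"{condition}: {pct:.0f}% incorrect")
```

Under these assumptions the larger circle subtends the smaller angle in the test arrangement, so an animal guided by retinal image size alone would choose the smaller circle, as the monocular fish did on 16 of 20 trials.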


INSTINCTIVE MOTOR RESPONSE 
STUDIES 


In addition to the size constancy 
studies described above, a second way 
of establishing that an animal has the 
capacity to discriminate depth is to 
observe the occurrence and accuracy 
of instinctive response patterns that 
require depth discrimination in order 
to occur. For example, Pumphrey 
(1948), Grinnell (1921), Hess (1950, 
1956, 1961), Benner (1938), and 
Bird (1926) have interpreted the 
pecking response in birds as an indi- 
cator of depth discrimination, be- 
cause the occurrence of pecking 
probably implies that the animal has
discriminated the distance to the ob-
ject pecked. Pumphrey and Grinnell 
noted that birds that seek stationary 
food always move their heads back 
and forth in a type of peering move-
ment, fixating the object to be pecked
before the pecking response occurs. 
This observation led Grinnell to 
speculate that motion parallax (rate 
of displacement of the image on the 
retina) is an important depth cue 
used by birds in localizing objects, 
since side-to-side peering necessarily 
gives rise to differential parallax as a 
function of the distance of the object. 
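
Grinnell's argument rests on simple geometry, which the following Python sketch makes explicit. It assumes a head translating sideways at constant speed v past a stationary object; the direction of the object then changes at the rate v·d/(d² + x²), where d is the perpendicular distance to the object and x is the lateral offset of the head. The speed and distances used are hypothetical values chosen for illustration, not measurements from any study cited here.

```python
import math


def parallax_rate_deg_per_s(
    head_speed_mm_s: float, distance_mm: float, lateral_offset_mm: float = 0.0
) -> float:
    """Angular velocity (deg/s) of a stationary object's direction for a
    head moving sideways at constant speed (simplified geometric model)."""
    rate_rad = (
        head_speed_mm_s * distance_mm / (distance_mm**2 + lateral_offset_mm**2)
    )
    return math.degrees(rate_rad)


# Hypothetical values: head moving at 20 mm/s, objects at 40 mm and 160 mm.
near = parallax_rate_deg_per_s(20, 40)
far = parallax_rate_deg_per_s(20, 160)

print(f"near object: {near:.2f} deg/s")
print(f"far object:  {far:.2f} deg/s")
print(f"ratio near/far: {near / far:.1f}")
```

In this simplified model the rate of displacement is inversely proportional to distance when the object is straight ahead, so the same peering movement displaces the image of a near object faster than that of a far one.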

Benner (1938) placed photographs 
on the ground in front of some chicks. 
He found that photographs of peas 
with the shadows all pointing in the 
same direction would elicit a pecking 
response, but photographs in which 
the shadows pointed in diverse direc- 
tions would not elicit the response. 
Hess (1950, 1961) designed an experi- 
ment to test whether pecking re- 
sponses to light and shading depth 
cues are learned or unlearned in 
chicks, by raising two groups of chicks 
under different kinds of illumination, 
and noting differences in their subse- 
quent pecking behavior. The experi- 
mental group was raised in a home 
cage with a wire mesh floor and glass- 
bottomed feeding trays, in which the 
illumination was from below. The
control group was raised in a similar
cage, except that the ceiling was wire
mesh instead of the floor, and the illu-
mination was from above. At age 7
weeks, all chicks were taken from their
home cage and placed one at a time in
a diffusely lighted test cage. A photo- 
graph of some grains of corn was 
placed on a homogeneous flat surface 
in front of the chick. The photograph 
was divided vertically in equal halves, 
with no visible contour of division. In 
one half, the shadows cast by the 
grains on the surface pointed upward, 
and in the other half they pointed 
downward. Hess simply observed 
whether the chick pecked at the 
photograph and, if so, at which half. 
As expected, animals in the experi-
mental group (home cage illumina-
tion from below) pecked at the grains
whose shadows pointed up, and ani- 
mals in the control group (home cage 
illumination from above) pecked at 
the grains whose shadows pointed 
down. In a second experiment, Hess 
repeated the above procedure, with 
two modifications: (a) all animals 
were tested weekly from ages 1 to 8 
weeks, and (b) the illumination in the 
test cage, no longer diffuse, always 
came from the direction opposite to
that in the cage in which the
chick was reared. In this experiment
the proportion of chicks that pecked 
differentially at the two halves of the 
photograph increased during the first 
6 or 7 weeks. Hess concluded that the
control of pecking by light and shade 
is acquired and that the direction of 
incidence of illumination determines 
for the young chick the nature and 
rate of development of the pecking 
response to these cues. 

In another experiment (Hess, 1956),
chicks wore goggles with prismatic 
lenses arranged so that stimulus ob- 
ject images were displaced towards 
the subject. When a grain of corn was 
suspended in the air directly in front 
of a chick wearing these goggles, its 
pecks at the grain were observed to 
fall short in every case. This finding 
was the same for newly-hatched 
chicks, chicks 6-8 weeks old, and 
2-3 month old chicks who had been 
restricted to monocular vision since 
hatching. Hess concluded that chicks 
employ binocular cues in visual depth 
discrimination, and that the binocu- 
lar basis of the discrimination is not 
learned. 

Bird (1926) undertook an evalua- 
tion of the role of maturation in the 
development of the pecking response 
in chicks. Three groups of chicks at 
different mean ages were studied: an 
8-hour-old group of animals, and two
30-hour groups. Animals in one of
the 30-hour groups were “not al-
lowed” any practice in pecking. It
was found that chicks in the 8-hour 
group typically emitted slow, oscilla- 
tory pecks, while the 30-hour-old 
chicks showed rapid, definite pecks. 
Bird concluded that maturation is an 
important factor in the development 
of pecking behavior. 

There are several other studies of 
the pecking response in chicks that 
were designed to assess the effects of 
maturation and learning. Most of 
these studies follow the pattern of the 
Bird (1926) experiment described 
above. Their findings confound the 
effects of sensory and motor develop- 
ment and hence throw little light on 
the acquisition of discriminative ca- 
pacity. It is probably true that birds 
only peck when they can discriminate 
distance; however, when a young bird 
fails to peck it may be because it is 
unable to discriminate distance, but it 
may also be because its motor ca- 
pacities are insufficiently developed, 
while its discriminative capacities are 
in fact adequate and perhaps amen- 
able to measurement by some other 
means. In other words, the ability to 
discriminate depth may be a neces-
sary but not a sufficient condition for
the pecking response to occur. 

In an attempt to study the very 
early development of visual depth 
discrimination, Fishman and Tal- 
larico (1961) subjected prematurely 
hatched chicks to an approaching 
and receding optical stimulus (a ball- 
point pen) during the first 10 seconds 
of visual experience, and noted the 
reflex occurrence of blinking re- 
sponses. When the ball-point pen 
was brought rapidly to within less 
than an inch of the chicks' eyes, the 
number of blinks was not signifi- 
cantly greater than in a control group 
not exposed to the approaching
stimulus. In a further experiment
(Fishman & Tallarico, 1961b), chicks 
aged 3 hours were tested in a similar 
fashion, using two types of stimulus: 
a rapidly approaching comb, and a 
pleated fan held at a fixed distance 
and extended in a plane perpendicu- 
lar to the chick's line of regard, so 
that the visual angle increased sud- 
denly. In this experiment the re- 
sponse was defined as a reflex move- 
ment of the head, away from the 
stimulus. It was found that the ap- 
proaching comb produced a greater 
number of responses than the ex- 
tended fan or the control condition 
(no stimulus), and that there were
no significant differences in the per- 
formance of light- and dark-reared 
chicks. Fishman and Tallarico re- 
garded these results as tentative evi- 
dence for the occurrence of innate 
visual depth discrimination in chicks 
at age 3 hours. They also concluded 
that a change in visual angle alone is 
insufficient for the discrimination to 
occur. 
Baldus (1926) studied depth dis- 
crimination in insects by observing 
food-getting responses of the larva of 
the dragonfly (Aeschna cyanea) in
order to make inferences about the 
basis of its depth discrimination. 
Larvae of this species remain motion- 
less with their front legs folded until 
prey is within reach (about 10 mm. 
for an animal 40 mm. long), and then 
abruptly unfold the legs and seize the 
prey. If the seizing response is ac- 
curate, the animal evidently dis- 
criminates the distance to the prey. 
Baldus tested larvae individually in a 
miniature aquarium, using stimuli 
mounted on the end of a wire and 
placed just under the surface of the 
water in the larva's field of vision. 
Flies, very small tadpoles, bits of tin- 
foil, and other small objects were used 
as stimuli; emission of the seizing response
proved to be independent of
the form of the stimulus. One group
of subjects was allowed normal
binocular vision; subjects in the other 
two groups were reduced to monocular
vision, in one group by surgically
removing a single optic ganglion, and 
in the other by covering one eye with 
black paraffin. Insects in the group 
with binocular vision exhibited ac- 
curate seizing responses when the 
stimulus was moved back and forth 
in front of them, but insects in the 
two monocular groups responded, al- 
most regardless of distance, to stimuli 
subtending the same visual angle as 
that subtended by normal prey within 
reach. Baldus concluded that depth 
discrimination in this species is under 
the control of binocular stimulation. 
The paraffin group was used as a con- 
trol against the possibility that the 
single-ganglion preparations were 
showing a motor inhibition resulting 
from physiological damage rather 
than an inability to discriminate 
monocularly the distance to the stim- 
ulus. Scheuring (1921) observed the 
behavior of the codfish (Gadus virens), 
which in feeding on smaller fishes 
exhibits a propulsive thrust of just 
the force necessary to carry the ani- 
mal to its prey. He inferred from the 
accuracy of the response that codfish 
discriminate distance rather finely. 
No attempt was made to isolate cues 
or otherwise to study the conditions 
necessary for the attack response to 
occur. 

Plateau (1888) placed insects of 
several species individually in one 
end of a darkened box, with two 
lighted rectangular openings at the 
other end, and noted the opening 
through which each insect escaped. 
One opening was in a fixed position, 
and the other was mounted near the 
edge of a slowly revolving disc that 
formed part of the end-wall of the box. 
The experiment was designed to test 
the conjecture that the insects would
be more likely to respond to the moving
stimulus, and would escape
through the revolving opening. This 
turned out not to be the case, when 
the two openings were of equal size 
and equally illuminated. However, 
Plateau did find that whenever one of 
the openings was either larger or more 
intensely illuminated, it was chosen 
more often regardless of whether it 
was the moving or the stationary 
opening. He speculated that size and 
brightness may have operated as 
illusory distance cues, and that the 
insects were responding to the ap- 
parently nearer of the two openings. 
At any rate it is clear that size and 
brightness, which normally serve as 
cues to depth in other species, 
were discriminated by these insects. 
Among the insects tested in Plateau's 
apparatus were the honeybee (Apis 
mellifera), the bumblebee (Bombus 
hortorum), the blue horse fly (Calli- 
phora vomitoria), and two varieties of 
butterfly (Pieris brassicae and Pieris 
napi). 
JUMPING STAND STUDIES


When an animal leaves a jumping 
stand, it is possible to study its depth 
discriminative capacities by observ- 
ing events associated with the jump 
under various experimental condi- 
tions. The original study of this type 
was done by Richardson (1909), who 
trained rats to jump from one plat- 
form to another. She measured for 
each rat the number of trials until 
jumping was “facile,” and gave de- 
scriptions of the jumping behavior of 
the rats. She concluded that visual 
input is not sufficient for accurate 
depth discrimination in rats who have 
not had opportunity for learning. 

Russell (1932) devised a jumping 
stand which measured the horizontal 
force exerted by an animal leaving the 
stand. Using this apparatus, he did a
thorough study of the conditions
necessary for rats to discriminate
distance. First, he established that 
rats are capable of depth discrimination,
by measuring the horizontal
force exerted by rats jumping to a
platform whose distance was varied between
trials over a range of 20 to
45 cm. from the stand. There was a 
black rectangle on the platform which 
served as a target. The data showed 
force and distance to be monotoni- 
cally related in the range of distances 
studied. Russell also observed a di- 
rect relation between distance and 
apparent disinclination to jump. 
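
A monotonic (ordinal) relation of the kind reported here is conveniently summarized by a rank correlation. The following is a minimal sketch in Python; the force values are invented purely for illustration and are not Russell's data, and `ranks` and `spearman_rho` are hypothetical helper names.

```python
def ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


# Hypothetical data: platform distance (cm) and mean horizontal force
# (arbitrary units). These numbers are assumptions, not published values.
distance_cm = [20, 25, 30, 35, 40, 45]
force = [11.0, 13.5, 14.0, 17.2, 19.9, 23.1]
rho = spearman_rho(distance_cm, force)
print(rho)
```

A coefficient near +1 indicates that force rises with distance in rank order, which is all that a monotonic relation asserts; it says nothing about whether the relation is linear.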

Next, Russell measured the difference
limen for distance, for both
albino and pigmented rats, by finding 
the smallest difference in distance be- 
tween two platforms that always 
elicited jumps of different forces. 
The DL as determined by a modified 
method of constant stimuli was 2 
cm. for pigmented rats and 4-5 cm. 
for albinos. Finally, he eliminated 
cues to depth one at a time and also 
in certain combinations, in an at- 
tempt to discover the critical stimulus 
dimensions controlling the discrimi- 
nation. There was some evidence 
that binocular cues were important: 
the rats’ eyes seemed to converge 
before jumping, suggesting that they 
were utilizing binocular convergence;
and it is known that the rat's optic 
tract decussates only partially and 


and 


systematically varied; compensatory 
changes were also made in the distance
between the platform and the
jumping stand, so that the retinal 
image of the target was the same size 
on all trials. Although the retinal 
image remained the same, the force 
exerted by the rat in leaving the 
jumping stand increased as the dis- 
tance to the target increased. This 
finding shows that rats discriminate 
distance in the absence of size cues. 
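
The compensation described here follows from the geometry of the visual angle: a frontal target of width s at distance d subtends an angle of 2 arctan(s / 2d), so that holding the angle fixed requires s to grow in proportion to d. A minimal sketch in Python, in which the reference target size is an assumption chosen for illustration and not a value taken from Russell's report:

```python
import math


def visual_angle_deg(size_cm, distance_cm):
    """Visual angle (degrees) subtended by a frontal target of the given width."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


def size_for_constant_angle(reference_size_cm, reference_distance_cm, new_distance_cm):
    """Target width that subtends the same visual angle at a new distance."""
    return reference_size_cm * new_distance_cm / reference_distance_cm


# Hypothetical reference target: 4 cm wide at 20 cm from the stand.
for d in (20, 30, 45):
    s = size_for_constant_angle(4.0, 20.0, d)
    print(d, round(s, 2), round(visual_angle_deg(s, d), 2))
```

Each printed row shows a different physical size but the same visual angle, which is the condition under which retinal size can no longer serve as a cue to distance.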

Motion parallax may also have 
been an important cue in depth dis- 
crimination in the above experiments, 
since almost all the rats exhibited 
head-swinging or peering behavior 
before jumping. To test the crucial 
character of this cue, a homogeneous 
white ground was placed behind the 
platform to preclude any contours 
which might furnish differential parallax
relative to the platform. Nevertheless,
the rats were able to discriminate
well, as measured by the force-distance
relation. However, since
animals in this series had been used 
in previous trials, practice effects 
were confounded with experimental 
treatments, so that it could not be
concluded unequivocally that the 
removal of motion parallax cues does 
not affect discrimination. The only 
other cue to be experimentally ma- 
nipulated was texture or aerial per- 
spective, which was degraded by 
lowering the level of illumination for 
a series of trials. Under this condi- 
tion, the discriminative behavior 
of the rats deteriorated noticeably. 
However, lowering the illumination 
may degrade other cues to depth 
besides aerial perspective; for ex- 
ample, light and shade. Thus, the 
decrease in discrimination may not 
have been due to the change in aerial 
perspective alone. 

When Russell (1932) eliminated 
depth cues in combination, he found 
that average absolute force in jump- 
ing was positively correlated with 


PAUL G. SHINKMAN 


the number of cues removed. That is, 
when several cues were removed, the 
ordinal distance-force relation was 
preserved, but the overall average 
force was greater, so that the rats 
tended to overjump. This result 
suggests that discriminative ability is 
to some extent a function of the 
number of cues present. In his con- 
cluding series, Russell placed the ap- 
paratus in the dark, with the target 
which had formerly been black now a 
lighted rectangular opening serving 
as the only source of illumination for 
the animal. Thus, all cues except 
retinal size were held constant as 
distance was varied. In this situation 
the rats were unable to discriminate 
distance (jumping force no longer 
correlated with target distance). 
This series of experiments led to 
the conclusion that rats can discrimi- 
nate distance in the absence of any 
single cue but that discrimination is 
more difficult when several cues are 
removed in combination.  Lashley 
and Russell (1934) later used the 
same apparatus in an experiment to 
determine whether rats' ability to 
discriminate distances is “innately
organized.” Each rat was raised in
darkness to age 100 days; on Day 101
it was brought into the light and given 
a single trial in which the jumping 
stand and the platform were sepa- 
rated by 5 cm., followed by five trials 
at 20 cm. The next day it was given 
five trials at 20 cm., three at 40 cm., 
three more at 20 cm., and finally three 
more at 40 cm. The following day it 
was given nine trials: one trial each at 
distances of 24, 26... 40 cm. The 
results showed an overall ordinal 
correlation between force and dis- 
tance, and also an average DL which 
was not significantly different from 
that of Russell's animals; Lashley 
and Russell (1934) concluded that
“the visual perception of distance
and gradation of force in jumping to
compensate for distance are not
acquired by learning....” In a
further experiment using the same 
procedure, Lashley (1937) found that 
lesions in the optic thalamus or collic- 
uli did not significantly impair depth 
discrimination in rats. 

Greenhut and Young (1953) re- 
peated the experiment of Russell 
(1932), and found that the force with 
which rats left the jumping stand did 
not correlate very highly with the 
distance to the target platform, when 
the distance was varied randomly 
between trials. They obtained a sub- 
stantial force-distance correlation 
only when the distance to the target 
platform was varied systematically 
on successive trials; i.e., gradually 
increased or gradually decreased. On 
the basis of this finding, these authors 
suggested that serial order effects 
may have accounted in part for 
Russell’s results. Greenhut and Young 
further noted that the horizontal 
force exerted by rats leaving the 
jumping stand was not always an 
accurate indicator of the actual dis- 
tance traversed by the animal. They 
proposed a weighted combination of 
horizontal and vertical forces as a 
more valid dependent variable for 
future studies employing the jump- 
ing stand method. In a subsequent 
study, Greenhut (1954) showed that 
rats discriminate stimulus objects at 
different distances in a two-choice 
alley runway. 

Wallace (1959) performed a series 
of experiments to study depth dis- 
crimination in the desert locust 
(Schistocerca gregaria Forskål). While
in its natural habitat this locust very 
frequently swings its head and upper 
thorax back and forth from one side 
to the other, in what appears to be a 
visual scanning or orienting response. 
The purpose of Wallace’s study was
to determine whether this scanning
behavior is responsible for depth 
discrimination. He used a miniature 
jumping stand and two platforms 
each with a black rectangular target. 
Pretests showed that the insect will 
always jump to the nearer of the two 
platforms. In the first experiment 
the sizes of the rectangles were varied 
along with their distance so that both 
rectangles had the same proximal 
image size. In the second experiment 
the nearer rectangle always cast a 
smaller image. Under both of these 
conditions the locusts almost always 
jumped to the nearer platform. 
Wallace concluded that the necessary 
cues for depth discrimination must 
come either from binocular vision or 
from the differential parallax result- 
ing from the locust's scanning re- 
sponses (a monocular cue). To ex- 
amine the first possibility Wallace 
replicated his experiments using in- 
sects rendered blind in one eye. 
These insects also jumped to the 
nearer platform, showing that binocu- 
lar vision is not a necessary condition 
for depth discrimination in this 
species. Wallace concluded by ex- 
clusion that the basis for the dis- 
crimination is the scanning response 
which gives rise to differential motion 
parallax. Further support for the 
critical role of the scanning response 
was furnished by a fourth experiment 
employing a single target which 
moved slowly from side to side. It 
was found that insects jumping at a 
moment when the target was moving 
in a direction opposite to the scanning 
movement tended to jump short, 
whereas insects jumping at a moment 
when the head and target were mov- 
ing in the same direction tended to 
overjump. This extremely interesting 
result is exactly what would be expected
if the locust's distance discrimination
were based on the rate at
which an image travels across the
compound eye. 
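
The direction of these errors can be illustrated with a simple model. Assume, as a simplification that is not drawn from Wallace's own analysis, that the image of a target roughly straight ahead sweeps across the eye at an angular rate equal to the relative lateral speed of head and target divided by the distance, and that the animal infers distance as though the target were stationary. The function names below are hypothetical.

```python
def image_angular_rate(head_speed, target_speed, distance):
    """Approximate angular rate (rad/s) of the target's image during a head sweep.

    Speeds are signed lateral velocities (cm/s). The target is assumed to lie
    roughly straight ahead, so rate = relative lateral speed / distance.
    """
    return (head_speed - target_speed) / distance


def estimated_distance(head_speed, target_speed, distance):
    """Distance inferred by an observer that assumes the target is stationary.

    Undefined (division by zero) if the target moves exactly with the head.
    """
    rate = image_angular_rate(head_speed, target_speed, distance)
    return head_speed / rate


true_distance = 10.0  # cm, illustrative value
head_speed = 2.0      # cm/s, illustrative value
for target_speed in (-1.0, 0.0, 1.0):
    print(target_speed, estimated_distance(head_speed, target_speed, true_distance))
```

Under these assumptions a target moving against the head sweep yields an underestimate of distance (a short jump), and a target moving with the sweep yields an overestimate (an overjump), in agreement with the pattern Wallace reported.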

Forel (1910) has suggested that 
sharpness of object contour may 
operate as a cue to distance for in- 
sects, whose lenses are rigid and have 
a fixed focus, since the amount of blur 
of an outline on the insect's retinae 
must have a direct relation to the 
distance of the object. This possibil- 
ity was not mentioned by Wallace, 
and might account for the results in 
his first three experiments. However, 
it does not account for his findings 
with moving targets. 


PHYSICAL CLIFF STUDIES


Depth discriminations may be ob- 
served in animals by recording cer- 
tain properties of their behavior 
when confronted with a cliff or preci- 
pice. However, visual, somesthetic, 
and kinesthetic sensory inputs may 
be confounded in studies employing 
this method. 

Yerkes (1904) placed tortoises individually
on a 30 × 60 cm. board at
various heights (30, 90, and 180 cm.) 
above a net, and measured the inter- 
val between the time the animal was 
placed on the board and the time it 
fell off the board and into the net. He 
observed three species of tortoise on 
this apparatus: a water tortoise
(Chrysemys picta Schneider), an amphibious
tortoise (Nanemys guttata
Schneider), and a land tortoise (Terrapene
carolina Linnaeus). Four
tortoises from each species were each 
given one trial a day for 10 days. The 
average time per trial (in minutes) 
spent on the board by animals of the 
three species at the three heights is 
shown in Table 2. The total number 
of animals among the three species 
that stayed on the board for 60 min- 
utes at the three heights is shown in 
Table 3. Both of these sets of data 
show clear main effects for height
and for species. Yerkes (1904) interpreted
the height effect as evidence
that vision plays a part in determin- 
ing the responses, and the species 
effect as a reflection of differences in 
the evolutionary importance of depth
discrimination for animals of the
same family who live in different
environments.

TABLE 2
AVERAGE MINUTES PER TRIAL SPENT ON THE BOARD BY THREE SPECIES OF TORTOISES AT THREE HEIGHTS
[Columns: Height; Species (Water, Amphibious, Land). Cell entries missing.]

TABLE 3
NUMBER OF TORTOISES OF THREE SPECIES WHO SPENT 60 MINUTES ON THE BOARD AT THREE HEIGHTS
[Columns: Height (30 cm., 90 cm., 180 cm.); Species (Water, Amphibious, Land). Cell entries missing.]

Evolutionary differences
may have been confounded,
however, with differences in the 
amount of experience land and water 
tortoises had had in situations 
resembling Yerkes’ physical cliff. 
Thorndike (1899) made similar ob- 
servations on chicks aged 95 hours. 
He placed them individually on plat- 
forms at varying heights above the 
table, and noted when they jumped
to the table. He reported that chicks
always jump quickly from heights
between 1 and 10 inches, hesitate a
long time before jumping from a
height of 22 inches, and almost never
jump from a height of 39 inches. He 
concluded that chicks discriminate 
depth innately; however, it is not 
clear to what extent he controlled the 
visual experience of his subjects. 
In a similar experiment, Kurke (1955) 
gave domestic chicks 10 days of “enforced
vertical experience” in a special
brooder containing a raised platform.
When tested on another platform
whose height varied, these
chicks jumped down to the table from 
a greater average height than chicks 
given only “restricted vertical experience.”
Waugh (1910) placed mice
individually on a small disc attached
perpendicularly to a movable vertical
rod passing through a hole in the 
table. He then raised the disc to vari- 
ous heights above the table and ob- 
served how much time elapsed before 
the mouse jumped down. A mild 
electric shock was used to induce the 
animal to jump. Waugh found a 
positive correlation between the time 
to jump and the distance to the table, 
and inferred that mice discriminate 
depths up to 18 cm., which was the 
greatest height he used. It is apparent,
however, that he left uncontrolled
the kinesthetic and vestibular
sensory inputs afforded the animal 
during the ascent of the disc. 


VISUAL CLIFF STUDIES


A principal defect in the cliff stud- 
ies cited above is the confounding of 
visual with nonvisual stimuli. Re- 
cently, studies of a similar nature 
have been done using a simple ap- 
paratus which eliminates all non- 
visual cues, thus enabling the experi- 
menter to obtain discriminations 
based solely on visual stimuli. The 
apparatus, known as the visual cliff 
(Gibson & Walk, 1960), consists of a 
runway board with a large sheet of 
clear glass directly underneath it, 


box at about 2 feet above the bottom.
The bottom of the box is covered with 
some pattern, for example a red and 
white checkered pattern. Paper bear- 
ing the same pattern covers the un- 
derside of the glass sheet on one side 
of the runway, which is called the
shallow side since the optic pattern is 
almost level with the runway. The 
other side is called the deep side, 
since the patterned surface is about 
2 feet below the runway. Animals 
are tested for depth discrimination 
by placing them on the runway and 
noting to which side they go if they 
leave the runway. Latencies and 
number of times the runway is 
crossed may also be measured. When 
animals consistently choose one side 
over the other, they are certainly 
responding differentially to whatever 
visual depth cues the pattern affords. 
Clearly, two classes of experimental
variables may be manipulated in


the optic relation 
fields adjoining the runway, and the 
other is the past history of visual 
stimulation of the animals who are
tested.

Gibson and Walk (1960, 1961) and 
Walk and Gibson (1961) tested 
chicks, turtles, rats, lambs, goats, 
pigs, kittens, and dogs, using a visual 
cliff with a red and white checkered
pattern, and found that all these 
species reliably chose the shallow 
side of the visual cliff. Chicks and 
goats always went to the shallow side 
at age 24 hours, and cats did the same 
at age 4 weeks. Cats placed by hand 
on the deep side showed freezing be- 
havior, which did not extinguish even 
after many trials. Tallarico (1961) 
observed a tendency for chicks to
choose the shallow side as early as
3 hours after hatching. 

In a series of control experiments, 
Gibson and Walk (1960) covered the 
underside of the glass in the visual 
cliff with plain gray paper on both 
sides of the runway. In this situa- 
tion, the proportion of rats leaving 
the runway on either side was not 
significantly different from one-half. 
As an additional control, tests were 
made with no paper under the glass 
on either side, so that both sides 
were deep. Rats in this series simply 
failed to leave the runway. 
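
A statement that a proportion is "not significantly different from one-half" can be evaluated with an exact binomial test. The sketch below is in Python; the counts are hypothetical and the choice of test is an assumption, since the statistical procedure actually used by Gibson and Walk is not described here.

```python
from math import comb


def binomial_two_sided_p(successes, trials, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely than the
    observed one under the null hypothesis of success probability p.
    """
    probs = [comb(trials, k) * p**k * (1 - p) ** (trials - k) for k in range(trials + 1)]
    observed = probs[successes]
    # A small relative tolerance keeps exactly tied outcomes in the sum.
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-9)))


# Hypothetical counts: number of animals descending on one side out of 20.
for k in (11, 15):
    print(k, binomial_two_sided_p(k, 20))
```

With these illustrative counts, 11 of 20 is entirely compatible with chance responding, whereas 15 of 20 would fall just below the conventional .05 level.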

Two major cues were presumed to 
determine side preference: relative 
motion parallax (rate of displacement 
on the retina of the images of the pat- 
tern elements on each side as the ani- 
mal moves along the runway) and 
the size of the retinal images of the 
pattern elements. Gibson and Walk 
next endeavored to isolate these cues 
and thus to assess their importance. 
Differential image size was eliminated 
by making the pattern elements on 
the shallow side smaller, so that the
pattern elements on the two sides
gave rise to retinal images which were 
equal in size. With motion parallax 
presumably the only remaining differ- 
ential depth cue, day-old chicks and 
adult rats almost invariably chose the 
shallow side. It will be noticed that 
other cues to depth remained besides 
the supposedly isolated motion paral- 
lax (e.g., differential accommodation). 
Next, differential motion parallax was 
eliminated by covering the underside 
of the glass on both sides of the run- 
way with patterned paper containing 
elements somewhat smaller on one 
side than on the other, thus isolating 
size as a depth cue. In a series of 
trials under this condition, day-old 
chicks chose the two sides about 
equally often; adult rats tended to 
choose the side with larger pattern 
elements.

Walk, Gibson, and Tighe (1957)
tested 90-day-old light- and dark- 
reared rats on the visual cliff, and 
found that animals from both groups 
reliably chose the shallow side, sup- 
porting the interpretation that depth 
discrimination in rats can occur with- 
out prior learning. Nealey and 
Edwards (1960) noted that the dark- 
reared animals in the Walk et al. 
(1957) experiment had been allowed 
a 20-minute period to adapt to the 
light before being tested. These 
authors performed an experiment de- 
signed to control for the possibility 
that very rapid learning, analogous to 
imprinting, had occurred during that 
time. They replicated the Walk et al. 
procedure, obtaining the same results. 
Two additional control groups were 
tested; one was a 90-day dark-reared 
group, for whom the 20 minutes of 
light adaptation took place under 
individual translucent hoods, thus 
excluding any opportunity for pat- 
tern-vision. Seventy-five percent of 
the rats in this group chose the shal- 
low side of the visual cliff; this finding 
supports the Walk et al. conclusion 
that depth discrimination occurs independently
of learning. In the
second control group, the rats were 
surgically blinded, to control for any 
nonvisual cues that might have been 
present. These rats did not discrimi- 
nate between the shallow and deep 
sides. 

Nealey³ also replicated the Walk
et al. (1957) study, but with dark- 
and light-reared rats 10 months old, 
and a pattern consisting of only three 
wide lines on either side parallel to 
the runway. These lines all sub- 
tended the same visual angle when 
viewed from the runway. He found 
that the light-reared rats chose the 
shallow side, and the dark-reared rats
failed to make the discrimination.
The same dark-reared rats were tested
immediately thereafter, using the
original Walk et al. stimulus pattern,
and still did not discriminate. This
suggests that the rats raised in the
dark for 10 months may have suffered
impaired vision. The impairment was
apparently temporary, however, because
when the rats were tested after
a month in the light, they consistently
chose the shallow side. In a subsequent
experiment, Nealey³ tested
the hypothesis that the Walk et al.
rats discriminated on the basis of
sharpness of focus. He projected
sharp and blurred patterns of equal
size onto the two sides of a ground
glass screen which was in the position
originally occupied by the clear glass
sheet in the visual cliff. It was found
that light-reared rats descended
equally often on both sides, thus ruling
out focus as the critical cue controlling
rats' behavior on the visual
cliff.

³ S. M. Nealey, personal communication,
1961.

TABLE 4

SUMMARY OF CLASSES AND METHODS USED IN EXPERIMENTS ON SUBHUMAN VISUAL
DEPTH DISCRIMINATION

Method | Insects | Fish | Reptiles | Birds | Mammals
Size constancy | Wallace | Herter; Meesters | | Götz; Hertz | Köhler
Instinctive response | Plateau; Baldus | Scheuring | | Benner; Bird; Fishman & Tallarico; Grinnell; Hess; Pumphrey |
Jumping stand | Wallace | | | | Greenhut & Young; Lashley; Lashley & Russell; Richardson; Russell
Physical cliff | | | | Kurke; Thorndike | Waugh
Visual cliff | | | | Gibson & Walk; Walk & Gibson; Tallarico | Gibson & Walk; Nealey; Nealey & Edwards; Walk & Gibson; Walk, Gibson, & Tighe

CONCLUSION 


Table 4 summarizes the research 
reviewed in this paper, showing what 
classes of animals were studied by the 
different methods. The capacity to
react discriminatively to the distance
of a visual stimulus appears to char- 
acterize a great many species, ranging 
from insects to primates. Especially 
in the case of insects, birds, and rats, 
it is evident that displacement of the 
images on the retinal mosaic is a very 
important factor in depth discrimina- 
tion. Wallace (1959) demonstrated 
with the desert locust the importance 
of visual scanning behavior (which 
necessarily gives rise to motion paral- 
lax), and Baldus (1926) found that 
the dragonfly larva responds to moving
stimuli, but never to stationary
stimuli. Pumphrey (1948) and 
Grinnell (1921) noted the occurrence 
of scanning responses in birds in the 
natural environment, and Gibson and 
Walk (1960) showed the critical im- 
portance of motion parallax in depth 
discrimination by chicks, performing 
under more controlled conditions. 
These findings support the general 
statement that stimulus image dis- 
placement and the resulting motion 
parallax are intimately linked with 
depth discrimination.

On the basis of her study of transposition
in the two-stimulus problem
in young children, Kuenne (1946) 
proposed that 
there are at least two developmental stages so 
far as the relation of verbal responses to overt 
choice behavior is concerned. In the first, the 
child is able to make differential verbal re- 
sponses to appropriate aspects of the situa- 
tion, but this verbalization does not control or 
influence his overt choice behavior. Later, 
such verbalizations gain control and dominate 
choice behavior (p. 488). 


Similarly, Kendler, Kendler, and 
Wells (1960) suggested in a study of 
reversal and nonreversal shifts in 
discrimination learning in preschool 
subjects (Ss) that 

there is a stage in human development in 
which verbal responses, though available, do 


not readily mediate between external stimuli 
and overt responses (p. 87) 


and noted that Luria (1957) had 
reached essentially the same conclu- 
sion. This suggestion that there is a 
stage of development in which verbal 
responses do not serve as mediators 
is designated the “mediational deficiency
hypothesis” in the present
paper. The evidence regarding this 
hypothesis is reviewed in the follow- 
ing sections. 


Reversal and Nonreversal 


Kendler and Kendler (1959) rea- 
soned that if verbal mediation occurs, 
performance on a reversal shift (in 
which the previously positive stimu- 
lus becomes negative; and the previ- 
ously negative stimulus, positive) 
should be superior to that on a non- 
reversal shift (in which the previ- 
ously relevant dimension becomes 
irrelevant; and the previously irrelevant
dimension, relevant). In the
reversal shift, appropriate verbal 
labels developed during the original 
discrimination remain relevant and 
facilitate learning; whereas in the 
nonreversal shift, new labels must be 
acquired and the old labels, developed 
in the original discrimination, inter- 
fere with learning. The verbal labels 
function like the “internal orienting
responses” of Goodwin and Lawrence
(1955), which involve the “identification
of and reaction to” the relevant
dimensions.
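
The structure of the two shifts can be made concrete by writing out the reward contingencies. The sketch below is in Python; the particular dimensions (size and brightness), the stimulus values, and the function names are assumptions chosen for illustration and are not taken from the procedures of the studies reviewed.

```python
# Each stimulus varies on two dimensions (illustrative assumption).
STIMULI = [
    {"size": "large", "brightness": "black"},
    {"size": "large", "brightness": "white"},
    {"size": "small", "brightness": "black"},
    {"size": "small", "brightness": "white"},
]


def make_rule(dimension, positive_value):
    """Return a function that reports whether a stimulus is rewarded."""
    return lambda stimulus: stimulus[dimension] == positive_value


original = make_rule("size", "large")            # initial discrimination
reversal = make_rule("size", "small")            # same dimension, values exchanged
nonreversal = make_rule("brightness", "black")   # previously irrelevant dimension


def agreement(rule_a, rule_b):
    """Fraction of stimuli to which two rules assign the same outcome."""
    return sum(rule_a(s) == rule_b(s) for s in STIMULI) / len(STIMULI)


print(agreement(original, reversal))     # outcome changes for every stimulus
print(agreement(original, nonreversal))  # outcome changes for half the stimuli
```

In the reversal shift the relevant dimension is unchanged although the outcome of every stimulus is exchanged, so that a label for the dimension remains useful; in the nonreversal shift half of the stimulus outcomes are unchanged but the relevant dimension itself is new.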

Kendler and Kendler (1959) re- 
ported that kindergarten Ss who 
learned the initial discrimination 
rapidly, compared with those who 
learned it more slowly, were superior 
on the reversal condition and slightly 
inferior on the nonreversal shift. 
They concluded that the Ss were in 
the process of developing mediating 
responses relevant to the task, and 
that some Ss were further along than 
others, since the fast initial learners 
responded as though they were medi- 
ating, and the slow initial learners 
responded as though mediation did 
not occur. Kendler, Kendler, and 
Wells (1960) found that preschool Ss 
performed like the slow-learning 
kindergarten Ss, and that instructions
to verbalize their choices on the
last 10 trials of the initial discrimina- 
tion had no effect on reversal or non- 
reversal shifts, leading to their state- 
ment of the mediational deficiency 
hypothesis. 

O'Connor and Hermelin (1959)
found that imbeciles learned a reversal
faster than normal preschool
Ss except when the imbeciles were
required to verbalize their choices on
the initial discrimination. Verbalization
interfered with reversal by
imbeciles. The imbeciles in the ver- 
balization group and the normal pre- 
school Ss performed in essentially 
the same way as those of Kendler 
et al. (1960). O'Connor and Hermelin 
interpreted their results as indicating 
that the use of verbal labels inter- 
feres with reversal shifts, because the 
S must inhibit not only the associa- 
tion between the overt choice re- 
sponse and the previously positive 
stimulus, but also the association 
between the choice response and the 
name of the previously positive stimu- 
lus. This interpretation does not 
conflict with that of Kendler and 
Kendler (1959) if it is assumed that 
in normal Ss the mediator is an ori- 
enting response, usually verbally 
directed, which involves identifica- 
tion of and reaction to the appropriate 
dimension, and that in imbeciles it is 
a verbal response functionally equiva- 
lent to a nonsense-syllable name (as 
in studies of acquired distinctiveness 
of cues). Kendler and Kendler as- 
sumed that S names both stimuli as 
members of a single dimension, and 
O'Connor and Hermelin assumed 
that the imbecile names only one 
stimulus. It should be emphasized 
that the performance of O'Connor 
and Hermelin's normal preschool Ss 
supports the mediational deficiency 
hypothesis, since it is in line with the 
findings of Kendler et al. (1959, 
1960). 


Discrimination Set 


In the preceding section it was 
assumed that the mediator in re- 
versal shifts is a verbal orienting re- 
sponse. Two studies of discrimina- 
tion set (orienting responses, identifi- 
cation of relevant dimensions, etc.) 
provide somewhat conflicting evidence
regarding the assumption.
Weiss (1954) found that set-inducing
instructions (informing Ss that the 
reward was always behind the same 
stimulus in a discrimination task) 
were more effective in older than in 
younger preschool Ss. This finding 
supports the assumption, since it 
supports the mediational deficiency 
hypothesis. On the other hand, 
Spiker (1959) found that although 
pretraining with distinctive stimuli 
facilitated learning a discrimination 
with similar stimuli, there was no 
significant age difference in the effec- 
tiveness of the pretraining. If the 
pretraining resulted in the acquisition 
of a discrimination set, the data con- 
tradict the mediational deficiency 
hypothesis or the assumption that 
discrimination set (orienting re- 
sponse) involves verbal mediators. 
However, as Spiker noted, the pre- 
training may serve only to minimize 
failure-produced responses by mini- 
mizing failures. The failure-produced 
responses are incompatible with effi- 
cient discrimination performance and 
their occurrence results in inferior 
performance. His results, then, do 
not necessarily contradict the media- 
tional deficiency hypothesis. 


Transposition 

Kuenne (1946) hypothesized that 
possession of a concept of the relation 
between the stimuli in a discrimina- 
tion task would facilitate transposi- 
tion because the concept would 
mediate correct responses. Using the 
two-stimulus problem, she found that 
the frequency of transposition on a 
"far" test, several steps removed 
from the training stimuli on the 
stimulus continuum, increased with 
increasing age level, but her data 
suggested that possession of the con- 
cept by younger preschool Ss did not 
facilitate transposition as much as it 
did in older preschool Ss. Alberts
and Ehrenfreund (1951) obtained
essentially the same results in a simi- 
lar study. Both of these studies, as 
well as others (Jackson, Stonex, 
Lane, & Dominguez, 1938; Terrell, 
1958; Terrell & Kennedy, 1957), 
found a high frequency of transposi- 
tion on the “near” tests (one step 
removed) in the two-stimulus prob- 
lem, with no significant age differ- 
ences. According to Spence's (1936) 
theory, mediation is not required for 
transposition on the near test; and 
Jackson et al. (1938) noted that 

although the subjects [3-6 years of age]
may readily understand the concept "bigger
than," they do not transfer by such verbal
analysis on critical trials [on a near test] (pp.
581-582).


Hunter's (1952) study, designed to 
test absolute versus relative theories 
of transposition, apparently demon- 
strated transposition in preschool Ss 
who did not have the relevant con- 
cept. Hunter attempted to design 
the tasks in such a way that absolute 
theories would predict no transposi- 
tion, but the Ss may have acquired a 
discrimination learning set (violating 
the boundary conditions of absolute 
theories of transposition) since he 
trained them on three discrimination 
tasks before giving them the trans- 
position test. Shepard (1957) found 
that discrimination learning set was 
maximal in preschool Ss after they 
were trained to criterion on one prob- 
lem. Hunter's results might be at- 
tributed to a discrimination learning 
set, but they might also be accounted 
for by Stevenson and Bitterman's 
(1955) hypothesis that S transposes 
if he fails to discriminate between the 
training and test sets of stimuli. The 
latter possible explanation is less 
plausible than the former, since 
Hunter had marked differences be- 
tween the training and test stimuli 
in some conditions.
Plenderleith (1956) found no sig-
nificant differences between normal
and feebleminded Ss (mean mental 
age about 69 months) in the acquisi- 
tion of a discrimination learning set 
or in the subsequent acquisition of a 
discrimination reversal set (except 
when there was a long interval be- 
tween sessions). Although other 
studies (Ellis, 1958; Kaufman & 
Peterson, 1958; Koch & Meyer, 1959; 
Stevenson & Swartz, 1958) have 
found a relationship between mental 
age and speed of acquisition of dis- 
crimination learning set, there is no 
evidence that verbal ability is di- 
rectly involved. Therefore, if the 
transposition in Hunter's (1952) Ss 
resulted from discrimination learning 
set rather than the usually postu- 
lated mechanisms, his finding that 
preverbal Ss transposed would not be 
relevant to the present topic. In sup- 
port of this conclusion, Levinson and 
Reese (1961) found no significant age 
difference in the speed of acquisition 
of discrimination learning set, indi- 
cating that it is at least less dependent 
on age level than verbal mediation is. 

Three studies of transposition in 
the intermediate-size problem tested 
the distance effect (i.e., the decrease 
in frequency of transposition with in- 
creasing separation between the train- 
ing and test stimuli on the stimulus 
continuum) in young children. Reese 
(in press) and Rudel (1960) found no 
significant difference between preschool
Ss who had the concept of
middlesizedness, or at least the con-
cept name, and those who did not. A
significant distance effect was obtained
in both studies. Similarly,
Reese (1961) found that on a far test 
(three steps removed from the train- 
ing stimuli) younger (preschool) Ss 
transposed only when the area ratio 
of the stimuli was small, i.e., when 
the transposition did not theoreti-
cally require mediation (Stevenson &
Bitterman, 1955), though older (kin-
dergarten) Ss transposed on the far
test even when the area ratio of the
stimuli was large, when mediation 
was presumably required. The results 
of the last study support the media- 
tional deficiency hypothesis and 
the Stevenson-Bitterman hypothesis. 
The obtaining of the distance effect 
in both "concept" and "no-concept"
groups in the first two studies can be 
explained by the Stevenson-Bitter- 
man hypothesis if it is assumed that 
mediation did not occur in the con- 
cept Ss. 

Spiker, Gerjuoy, and Shepard 
(1956) interpreted their results as 
indicating a greater frequency of 
transposition in concept than in no- 
concept Ss, but their procedure was 
such that mediation need not be 
assumed to have occurred, since 
acquired distinctiveness of cues could 
account for their findings. That is, in 
their study learning could have been 
facilitated by acquired distinctive- 
ness in the concept group, even if 
mediation did not occur, but there 
would be no acquired distinctiveness 
in the no-concept group. 


Acquired Distinctiveness 


Studies of acquired distinctiveness 
of cues (stimulus pretraining) in 
young children have uniformly found 
no significant deficiency in the effec- 
tiveness of the pretraining for the 
younger Ss. Norcross and Spiker 
(1957) found that younger preschool 
Ss made fewer correct responses than 
older ones, but stimulus pretraining 
was equally effective in both groups, 
i.e., age did not interact significantly 
with experimental conditions. Spiker 
(1956) found that stimulus pretrain- 
ing was more effective in younger 
than older preschool Ss. The younger 
control group was inferior to the
other groups, but the other groups
performed at about the same high 
level. Weir and Stevenson (1959) 
obtained a similar result with their 
preschool Ss, but the interaction be- 
tween age and experimental condi- 
tions was apparently not significant. 
Finally, Cantor (1955) found no sig- 
nificant effects of age levels in his 
study of stimulus pretraining in pre- 
school Ss. 

The interpretation of the results of 
these studies must take possible ceil- 
ing effects into account. The maxi- 
mum mean percentages correct re- 
sponses in the studies were: Norcross 
and Spiker, 83%; Spiker, 86%; Weir 
and Stevenson, about 90%; and 
Cantor, 79%. It appears that there 
were ceiling effects, particularly in 
Spiker and Weir and Stevenson. If 
there were ceiling effects, the effec- 
tiveness of stimulus pretraining for 
the older experimental groups would 
have been obscured. It is also pos- 
sible that the older control Ss used 
pre-experimentally acquired names 
for the stimuli, obscuring the stimu- 
lus pretraining effect in the older 
experimental groups. In either case, 
no definite conclusion could be made 
about the relative effectiveness of 
the pretraining in younger and older 
preschool Ss. 

Acquired distinctiveness does not 
involve mediation, according to Dol- 
lard and Miller's (1950) interpreta- 
tion, and therefore a failure to find a 
deficiency in younger preschool Ss 
would not contradict the media- 
tional deficiency hypothesis. How- 
ever, Spiker (1956) has suggested 
that acquired distinctiveness may 
result from the use of the stimulus 
names for rehearsal of the stimulus- 
response connections during the inter- 
stimulus interval. His interpretation 
requires mediation, since the rehearsal 
would not facilitate performance un-
less the stimulus name mediated the
appropriate response when the stimu- 
lus was presented. According to his 
interpretation, then, a failure to find 
a deficiency in younger Ss would con- 
tradict the mediational deficiency 
hypothesis. Since the results of the 
previous studies are inconclusive be- 
cause of the possibility of ceiling 
effects and the possible occurrence of 
pre-experimentally acquired stimulus 
names, it is apparent that further 
study of the acquired distinctiveness 
of cues is required. 

There is no clear-cut evidence re- 
garding the mechanisms which have 
been assumed to underlie acquired 
distinctiveness. Jeffrey (1958a) found 
that learning names for two stick 
figures, one pointing to the right and 
the other to the left, was facilitated in 
preschool Ss by learning to push 
buttons toward which the figures 
pointed. The response unit was 
present during the transfer task, and 
Jeffrey reported that "one S lifted the
appropriate shoulder before naming,
but other Ss would look at the appropriate
button before supplying
the name of the figure" (p. 274).
Whether mediation or acquired dis- 
tinctiveness was involved is not clear. 
Jeffrey (1958b) has also obtained 
facilitation of learning to associate 
buttons with piano tones by pretrain- 
ing Ss to match the tones with a 
piano or by singing. In this study, 
however, the facilitation may have 
resulted from the development of a 
discrimination set. In studies in 
which the response unit was not pres- 
ent during the transfer task, pretrain- 
ing with motor responses has not 
produced facilitation. For example, 
Murdock (1958) found that motor 
pretraining did not produce facilita- 
tion in college students; and Reese, in 
an unpublished study, found no facili-
tation of fifth graders’ learning to
associate buttons with colored stim-
uli following pretraining in which the 
stimuli were associated with switch 
throwing responses. Although the Ss 
of the former and latter pairs of 
studies differ in age, the presence or 
absence of the response unit during 
the transfer task may be the critical 
variable determining whether or not 
facilitation occurs on the transfer 
task. 


Acquired Equivalence 


There have been only two studies 
of the acquired equivalence of cues in 
preschool Ss, and both of these also 
involved acquired distinctiveness. In 
both studies it was impossible to 
separate the effects of acquired 
equivalence and distinctiveness ex- 
perimentally or statistically. Jeffrey 
(1953) reported that pretraining with 
verbal and motor responses led to 
greater facilitation of performance in 
older than in younger preschool Ss 
(the groups were divided on the basis 
of MA, which apparently also yielded 
a division on the basis of CA); and 
Shepard's (1954) data implied a 
similar result, since she obtained a
correlation of .70 between errors on a
transfer task and trials to criterion on 
the name learning task. Although 
she did not report the correlation
between age level and trials to criterion,
age level should be negatively corre- 
lated with trials to criterion and there- 
fore negatively correlated with errors 
on the transfer task. Since studies of
acquired distinctiveness, which does 
not necessarily involve mediation, 
have found no deficiency in younger 
Ss (see above), the Jeffrey and 
Shepard studies may be interpreted 
as indicating a deficiency in younger
Ss in acquired equivalence, which
does require mediation, supporting
the mediational deficiency hypothesis.


Double Alternation


Although it is usually considered 
that a series of double alternations 
requires mediation, Hunter and 
Bartlett (1948) found that of 11 Ss 
below the age of 48 months, 9 failed 
to reach criterion on a double alterna- 
tion (the 2 who reached criterion were 
43 and 45 months of age), but no S 
younger than 60 months gave the 
basis of responding either spontane- 
ously or in response to questions 
asked at the end of training. Stolurow 
and Pascal (1950), studying double 
alternation in mental defectives, also 
reported that the solution of the 
problem did not always indicate 
ability to verbalize the correct pat- 
tern of response. The implications of 
these results are inconclusive, how- 
ever, since Bugelski and Scharlock 
(1952) have shown that even college 
students may mediate without being 
able to verbalize the process, i.e., 
without awareness. If mediation oc- 
curred without awareness in the 
younger Ss who reached criterion in 
the Hunter and Bartlett (1948) and 
Stolurow and Pascal (1950) studies, 
and if the other younger Ss also
possessed the concepts required
("left" and "right"), the results may
be interpreted as supporting the 
mediational deficiency hypothesis. 


CONCLUSION 


Studies of reversal and nonreversal 
learning, transposition in the two-
stimulus and intermediate-size prob-
lems, acquired equivalence of cues,
and possibly other problems indicate 
that there is a deficiency in mediation 
in young children, compared with 
older children. The studies reviewed 
above indicate that the critical age 
for the occurrence of mediation may 
be different for different experimental 
situations and for different concepts. 
It seems likely that in some cases the 
deficiency is a characteristic of an 
early stage of human development, 
but that in others it may be a char- 
acteristic of an early stage of concept 
formation. There is some evidence 
that inadequately learned stimulus 
names, if used for rehearsal (as sug- 
gested by Spiker, 1961), produce 
interference. Reese's (1960) data 
suggest such a trend in fourth, fifth, 
and sixth grade school children, and 
McCormack's (1958) study suggests 
it in college students. It is proposed, 
then, that with a well-learned concept 
there is no necessary deficiency in 
mediation as a function of age, but 
with a less well-established concept 
there is a deficiency at any age. (For 
a discussion of the possible sources of 
deficiency with inadequately learned
concepts, see Spiker, 1961.)

If mediation is a “voluntary” proc- 
ess, as rehearsal is, there may be a 
stage of development in which Ss 
have typically not yet learned to use 
it, and instruction in the use of the 
process should facilitate the learning 
of these Ss. If it is an involuntary or 
automatic process, instructions should 
have no effect.


One of the main variables which
affect the critical flicker-fusion fre-
quency, CFF, is the proportion of
light to the total cycle, P_L. This
determinant is usually indicated by 
the term "light:dark ratio," LDR. 
Although it is completely specified by 
either of the two expressions LDR or 
P_L, the latter will be used here, for
reasons given later. Landis (1954) 
concluded from a survey of work on 
the problem that the relationship 
between CFF and P_L was confused
and stated that "no-one has offered
the beginning of a solution" (p. 274).
He found that inconsistent results 
made theoretical interpretation im- 
possible. 

The present paper will indicate
that consistent results are now emerg- 
ing and will evaluate two approaches 
to their representation and interpre- 
tation. The first approach is the con- 
ventional one which plots CFF scores 
in cycles per second. The second ap- 
proach, not previously described, 
uses different functions of the same 
data to give simpler relationships be- 
tween variables. These relationships 
can in turn be related to other visual 
tasks and make possible further tests 
of hypotheses on the nature of the 
mechanisms of visual fusion. 


TERMINOLOGY 


CFF is defined as the frequency at 
which an intermittent light appears 
fused on 50% of trials. Terminology 
and relationships between CFF vari- 
ables can be set out in two sections, 
one applying at all constant frequen-
cies and the second to the particular
frequency of CFF, the latter being 
specific cases of the former. The time 
values are expressed in seconds or 
milliseconds. 

1. At all constant frequencies:
P_L = proportion of light to cycle,
P_D = proportion of dark to cycle,

$$P_L + P_D = 1$$

LDR = light:dark ratio,

$$\mathrm{LDR} = \frac{P_L}{P_D} = \frac{P_L}{1 - P_L}$$

t'_C = time of one cycle, t'_L = time of
one light flash, t'_D = time of one dark
period,

$$t'_C = t'_L + t'_D$$

2. At critical flicker-fusion fre-
quency: t_C = critical cycle time, t_L =
time of one light flash, t_D = time of
one dark period,

$$t_C = t_L + t_D$$

If CFF and P_L are known, then the
values of t_L and t_D may be found by
the following formulae:

$$t_L = P_L \cdot \frac{1}{\mathrm{CFF}}$$

$$t_D = P_D \cdot \frac{1}{\mathrm{CFF}}$$

Further relationships between
these variables can be derived but the
most important have been given
above.
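
The relationships in this section can be illustrated with a short
numerical sketch. The Python function below is not part of the
original report; its name, its argument names, and the sample values
(a CFF of 40 cycles per second at P_L = .25) are assumptions chosen
only for illustration. It computes the critical cycle time and its
light and dark components from a CFF score and a P_L value.

```python
def fusion_times(cff_hz, p_light):
    """Return (t_C, t_L, t_D) in seconds at fusion.

    cff_hz  -- critical flicker-fusion frequency in cycles per second
    p_light -- proportion of light to cycle, P_L (so P_D = 1 - P_L)
    """
    if cff_hz <= 0:
        raise ValueError("CFF must be positive")
    if not 0 < p_light < 1:
        raise ValueError("P_L must lie strictly between 0 and 1")
    t_cycle = 1.0 / cff_hz                # t_C = 1 / CFF
    t_light = p_light * t_cycle           # t_L = P_L * (1 / CFF)
    t_dark = (1.0 - p_light) * t_cycle    # t_D = P_D * (1 / CFF)
    return t_cycle, t_light, t_dark


# Hypothetical example values, not data from any study cited here.
t_c, t_l, t_d = fusion_times(40.0, 0.25)
print(f"t_C = {t_c * 1000:.2f} msec.")
print(f"t_L = {t_l * 1000:.2f} msec.")
print(f"t_D = {t_d * 1000:.2f} msec.")
```

With these assumed values the cycle of 25 msec. divides into a light
flash of 6.25 msec. and a dark period of 18.75 msec., and the two
components sum to the cycle time as required.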




LIGHT TO CYCLE AS DETERMINATION OF CFF 


Different expressions have been 
used by various authors for the con-
cepts of t_C and P_L as defined above.
For example, Brown and Forsyth
(1959) used the term "period" instead
of t_C while Bartley has used
several terms for the concept. These 
include: critical flash interval (1937), 
critical flicker interval (1951), and 
cycle length and intermittency cycle 
(1958). The term "critical cycle
time" is used here because it conveys
the exact meaning required with em- 
phasis on time at fusion. 

The expressions used to denote P_L
include: light:dark ratio (Landis, 
1954; Ross, 1938; Winchell & Simon- 
son, 1951; etc.), flash duration or 
pulse-to-cycle fraction (Bartley, 1937, 
1958), relative duration of light in- 
terval or light flash duration (Crozier 
& Wolf, 1941), and light-time propor- 
tion (McFarland, Warren, & Karis, 
1958). The most common of these 
terms, light:dark ratio, is often erroneously
applied to P_L values. Thus,
Lloyd and Landis (1960) defined
light:dark ratio as duration of light
interval/total cycle time (p. 334),
which corresponds to the definition of
P_L given above. The values they
gave as LDR's, viz., .005, .01, .25,
.50, .75, and .95 are P_L values, not ratios
of light to dark periods. Similarly,
Landis (1954) applied the
term, light:dark ratio, to values
ranging from .1 to .9 and Battersby
and Jaffe (1953) to the figures 20%,
50%, and 80%, which are obviously
P_L values. This incorrect use of the
term, light:dark ratio, is confusing; 
particularly since it was correctly de- 
fined by the authors in other sections 
of their reports. 

Bartley and Nelson (1960b) sup- 
ported an earlier use of the term, 
pulse-to-cycle fraction, instead of 
light:dark ratio (Bartley, 1958), with 
the objection that "'light' and 'dark'
are experiential response terms, not
stimulus terms" (p. 241). This does
not seem to be a valid or sufficient 
reason for rejecting the term, light: 
dark ratio, whereas the confusion 
arising when LDR is used also to 
mean P_L is decisive. It is simpler to
express and easier to comprehend immediately
P_L values such as .3, .5,
and .7 than the equivalent LDR
values of 3/7, 1, and 7/3. The term
P_L is also preferable to any of the previous
terms in that it can be incorporated
in a consistent terminology
as is shown above. The use of P_L
and P_D to refer to proportions, and of
t_L and t_D to refer to times at fusion,
avoids the ambiguity which arises
when t_L and t_D are used for both proportions
and times as was done by
Crozier and Wolf (1941) and Rutsch- 
mann (1955). 
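
The distinction between the two expressions can be made concrete
with a brief sketch of the conversion in each direction. The code is
an illustration only and is not drawn from any of the reports cited;
the function names are assumptions. Exact fractions are used so that
the output can be compared directly with the LDR values quoted above.

```python
from fractions import Fraction


def ldr_from_p_light(p_light):
    """Convert a proportion of light to cycle, P_L, to a light:dark ratio."""
    p = Fraction(p_light)
    if not 0 < p < 1:
        raise ValueError("P_L must lie strictly between 0 and 1")
    return p / (1 - p)          # LDR = P_L / P_D = P_L / (1 - P_L)


def p_light_from_ldr(ldr):
    """Convert a light:dark ratio back to a proportion of light to cycle."""
    r = Fraction(ldr)
    if r <= 0:
        raise ValueError("LDR must be positive")
    return r / (1 + r)          # P_L = LDR / (1 + LDR)


# Decimal strings are passed so that the fractions stay exact.
for p_value in ("0.3", "0.5", "0.7"):
    print(p_value, "->", ldr_from_p_light(p_value))
```

A value such as .25 reported as a "light:dark ratio" is therefore
ambiguous: read as P_L it corresponds to an LDR of 1/3, whereas read
as a true ratio it corresponds to a P_L of .2.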


BRIGHTNESS COMPENSATION 


When P_L is altered systematically,
the factor of brightness compensation 
must be considered. Talbot’s Law 
(or the Talbot-Plateau Law) states 
that the apparent brightness at 
fusion, i, of a test patch, is determined 
by the product of the maximum 
luminance, I, and P_L; that is,


$$i = I \cdot P_L$$


This equation has been shown to 
hold only at and above fusion frequency.
If I is changed as P_L is
changed, so that the fused brightness,
i, as defined by Talbot's Law,
remains constant, then this is referred 
to as a compensated experiment 
(Cobb, 1934; Lloyd & Landis, 1960; 
Ross, 1938; Winchell & Simonson, 
1951). On the other hand, if I is kept 
constant so that fused brightness, i, 
changes with P_L, then this is referred
to as an uncompensated experiment
(Bartley, 1937; Battersby & Jaffe,
1953; Crozier & Wolf, 1941; Ross,
1943). 
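
The difference between the two procedures can be sketched
numerically. The example below is illustrative only; the function
names and the luminance figures (in arbitrary units) are assumptions
and are not taken from any of the experiments cited. It applies
Talbot's Law to show how maximum luminance must change with P_L in a
compensated experiment, and how fused brightness changes with P_L in
an uncompensated one.

```python
def fused_brightness(max_luminance, p_light):
    """Talbot's Law: fused brightness i = I * P_L (at and above fusion)."""
    return max_luminance * p_light


def compensated_luminance(target_brightness, p_light):
    """Maximum luminance I needed to hold fused brightness i constant."""
    if not 0 < p_light <= 1:
        raise ValueError("P_L must be greater than 0 and at most 1")
    return target_brightness / p_light


P_VALUES = (0.25, 0.50, 0.75)   # hypothetical P_L settings

# Compensated: i is fixed (100 arbitrary units), so I rises as P_L falls.
for p in P_VALUES:
    print(f"compensated   P_L={p:.2f}  I={compensated_luminance(100.0, p):.2f}")

# Uncompensated: I is fixed (200 arbitrary units), so i rises with P_L.
for p in P_VALUES:
    print(f"uncompensated P_L={p:.2f}  i={fused_brightness(200.0, p):.2f}")
```

The sketch makes plain why compensation demands the highest
luminances at the lowest P_L values, a point which bears on the shape
of the curves discussed in the following sections.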

The advantage of compensation is 
claimed to be (Bartley, 1941) that it 
allows "the effect of the temporal 
course of the stimulus pattern alone 
on c.f.f. to be determined" (p. 121) 
since brightness changes at fusion are 
controlled to give a constant average 
amount of light. However, altering I 
to keep i constant in accordance with 
Talbot's Law causes differences in 
apparent brightness below fusion. 
Crozier and Wolf (1941) used these 
differences below fusion to argue 
against applying Talbot’s Law in 
CFF studies. They believed the dark 
period to be important and claimed 
that compensation obscures the ef- 
fects of light and dark intervals as 
well as giving rise to unsymmetrical 
CFF-Log I curves. They asserted 
that it is maximum luminance and 
not fused brightness which is crucial. 

Since, by definition, at CFF the 
subject reports flicker and fusion 
equally frequently, there would appear
to be no a priori reason for selecting
either (a) constant apparent
brightness above fusion with variable
luminance, or (b) variable apparent 
brightness above fusion with constant 
luminance. The choice must surely 
be made on empirical evidence as to 
which conditions produce the sim- 
plest, most meaningful curves that can 
be related to other visual phenomena. 


FIRST APPROACH


Shape of Curves


Recent experiments on the CFF-P_L
relationship (Bartley & Nelson,
1961; Lloyd & Landis, 1960) confirm
the consistent patterns emerging in
spite of differences in stimulus condi-
tions. The contradictions found by
Landis in studies prior to 1954 have,
by systematic research, been attrib-
uted mainly to the factors of compensation,
differences in luminance
and areas, or to a combination of
these. It is now clear that at low
luminance levels (uncompensated),
the curve of CFF against P_L is bowed,
with maximum CFF occurring at a
P_L of .5, while at high luminance
levels, the maximum is at a low P_L.
Compensation tends to straighten
the curves since it requires higher
luminances at low P_L's and therefore
raises the CFF scores at these values;
compensation causes progressively
less difference in CFF scores as P_L is
increased. An increase in area raises
all curves on the ordinate.


FIG. 1. CFF versus P_L for selected uncom-
pensated luminances, in trolands. (Data for
subject D. D., Lloyd & Landis, 1960.)


FIG. 2. CFF versus P_L for selected com-
pensated luminances, in trolands. (Data
as for Figure 1.)

[Ordinate: CFF, cycles/sec.]

The general shape of CFF-P_L
graphs at different luminance levels is
illustrated in Figure 1 for uncom-
pensated, and Figure 2 for compen-
sated luminances. The values used
were estimated from the data given in
a report by Lloyd and Landis (1960)
and are for conditions of 1° test patch
area and log luminances as labeled for
only one subject, D. D., but his results
are considered typical. (The authors
give all luminance levels for P_L = .5 as
.03 log units lower than those for all
other P_L values. This unexplained
difference is allowed for in the figures
but it causes the values for P_L = .5 to
be approximate.)

Ross (1938, 1943) compared the 
compensated and uncompensated 
methods. His results were used by 
Landis (1954) who, unfortunately, re- 
versed the labels on the graphs and 
his subsequent discussion is therefore 
misleading. If the graphs are cor- 
rectly labeled, it can be seen that as 
in Figure 2, compensation straightens 
the curves of CFF against P_L.


Bartley's Model 


The majority of studies on the relationship
between P_L and CFF reported
in the last few years have been
by Bartley and his co-workers (Bart- 
ley, 1958; Bartley & Nelson, 1960b, 
1960c, 1961). Bartley (1958) proposed
"a conceptual model of CFF"
which was based on an earlier finding 
(Bartley, 1937) that when CFF is 
plotted against luminance, the curves 
for various P_L values cross. Bartley
emphasized that at the intersection of
two P_L curves, this particular com-
bination of frequency and luminance
has two P_L values which are equiva-
lent in producing fusion. He also
pointed out that when CFF is plotted
against P_L for different luminance
levels (as in Figure 1), the graphs are
curved so that for a particular fre-
quency and constant luminance,
again two or more P_L's on the abscissa
produce fusion. Bartley's predictions
about the shape of these
CFF-P_L curves will be discussed in
this paper but none of his neuro-
physiological theorizing will be in-
cluded.


FIG. 3. Schematic curves of CFF versus P_L
as predicted for two methods by Bartley's
model.


Bartley's earlier model (1958) 
states that for a constant frequency 
(or cycle time) at an intermediate 
luminance, a very low P_L always produces
flicker. As P_L is increased, with
the cycle time remaining constant, 
his model predicts that the subjects' 
reports change to fusion, then to 
flicker, and finally to fusion. This is 
shown schematically in Figure 3 in 
which A, B, and C indicate the three 
transition points expected. The curve 
itself represents the CFF values 
where, by definition, flicker and 
fusion are reported on half the trials 
so that fusion is reported at points 
above the curve and flicker at points 
below the curve on more than half the 
trials. 
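
The logic by which transition points are read from such a curve can
be sketched as follows. The code is an illustration under stated
assumptions and is not Bartley's procedure: the function name is
hypothetical, the curve is treated as straight between sampled
points, and the sample values are invented solely to mimic the
schematic shape of the model and do not represent data from any
study. For a constant frequency, fusion is taken to be reported where
the frequency lies above the curve and flicker where it lies below.

```python
def transition_points(p_values, cff_values, frequency):
    """Locate crossings of a constant frequency with a sampled CFF-P_L curve.

    Returns a list of (P_L at crossing, kind), scanning from low to high
    P_L, where kind is "flicker-to-fusion" or "fusion-to-flicker".
    Crossings are found by linear interpolation between adjacent samples;
    samples lying exactly on the frequency line are ignored.
    """
    points = list(zip(p_values, cff_values))
    transitions = []
    for (p0, c0), (p1, c1) in zip(points, points[1:]):
        d0 = c0 - frequency   # positive: curve above line, flicker reported
        d1 = c1 - frequency   # negative: curve below line, fusion reported
        if d0 * d1 < 0:
            p_cross = p0 + (p1 - p0) * d0 / (d0 - d1)
            kind = "flicker-to-fusion" if d0 > 0 else "fusion-to-flicker"
            transitions.append((p_cross, kind))
    return transitions


# Invented schematic curve with an inversion at low P_L (not real data).
P_L_SAMPLES = [0.01, 0.05, 0.10, 0.30, 0.50, 0.70, 0.90]
CFF_SAMPLES = [34.0, 28.0, 31.0, 38.0, 40.0, 36.0, 26.0]

for p_cross, kind in transition_points(P_L_SAMPLES, CFF_SAMPLES, 30.0):
    print(f"P_L = {p_cross:.3f}: {kind}")
```

With the invented curve the constant frequency is crossed three
times, in the order flicker to fusion, fusion to flicker, and flicker
to fusion, corresponding to transitions A, B, and C of the model. A
curve without the inversion at low P_L would yield a single crossing
at the high-P_L end.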

Bartley (1958) stated that each 
transition A, B, and C "represents 
what we ordinarily... call critical 
flicker frequency" (p. 112). However,
in the introduction to a later re-
port, he and Nelson (1960b) stated
that only transitions A and C (Figure
3) "represent the conditions we
usually label as CFF," but transition
B, from fusion to flicker, "has no
name since no-one else, to our knowledge,
has either entertained such a
transition or obtained it. We might
call it FFF, fusion-to-flicker frequency"
(pp. 241-242). It seems unnecessary
to give a new name to this
fusion-to-flicker transition, since by 
definition all points on the curve are 
CFF values and there is no evidence 
that different results are obtained 
when P_L is changed at constant fre-
quency than when frequency is
changed at constant P_L.

Within a range of "not-too-short"
cycle lengths and at a medium lumi-
nance level, Bartley's model postu-
lates two or three transition points de-
pending on the method of limits used.
Two methods are discussed or im-
plied:

1. To hold cycle time constant and
vary P_L

2. To hold P_L constant and vary
cycle time (i.e., frequency)

Each of these two procedures can be
performed in two different directions:
(a) from flicker to fusion, or (b) from
fusion to flicker. The model makes
no prediction of the number of transi-
tion points when the constant method
is used.

Bartley (1958) described Method 
1 and claimed that it should yield 
three transitions, but in the later 
article, he and Nelson (1960b) stated 
"by the present method of going 
from flicker to fusion, only two cross- 
ings under any single combination of 
intensity and cycle lengths are to be 
expected" (p. 244). The implication 
is that if Method 2a is used, the 
CFF-P_L graph will appear as dotted
in Figure 3. The hypothesis that different
curves are obtained by different


ADRIENNE THROSBY 


psychophysical methods was not 
made explicit then by the authors, 
but they did use what they termed
the "direct" and "inferential" methods
(i.e., Methods 1 and 2, respec-
tively) in their latest report (1961).
The evidence from this and other
studies discussed below indicates
that not only do the two methods
yield similar results but that the
model's expectation of flicker at very
short P_L's is not confirmed in any
study.

Lloyd and Landis (1960), for ex-
ample, using Method 2 and the lowest
P_L values to date, namely .005
and .01, did find very slight reversals
in the direction of the curve at these 
values, but closer examination indi- 
cates that the inversions apparently 
occur randomly and are almost cer- 
tainly experimental error. There are 
13 out of 26 possible inversions and 
these occur at high as well as medium 
luminances. But of these 13 inversions,
12 occur for the 1° and only one
for the 2° area, and 10 of the inver-
sions are reported by one subject.
This lack of generality must preclude
the results from supporting the model
which requires consistent inversions
under specified conditions.

Bartley and Nelson's article 
(1960b), publishing results for seven 
subjects, gave no details of the ob- 
servers apart from noting that two 
were the authors. Possibly, all the 
other subjects were untrained or 
naive, since their curves are reminis- 
cent of those given by Nelson, Bart- 
ley, and de Hardt (1959) in their 
article on variability, although stand- 
ard deviations are not reported. 
However, the curves of their last sub- 
ject, BAR, presumably Bartley and 
therefore a trained subject, are the 
most regular and the most like other 
authors' results. By the same token, 
they also do not conform to expecta-
tions of Bartley's model.
In their second report for the two
trained subjects from the previous
report, Bartley and Nelson (1960c)
draw graphs of CFF plotted against
P_L and on each graph there are one or
more horizontal lines (such as shown 
in Figure 3) at apparently arbitrarily 
determined frequencies. Numbers 
are given both to the point where the 
lines did cut the curves (‘jogs’) and 
also where they did not but were ex- 
pected to by the model. At the two 
intermediate luminances where the 
model should hold, there is no indica-
tion of an inversion at the lowest P_L,
but a number is given in each case to 
its predicted crossing point. 

Bartley and Nelson (1961) sug- 
gested that one reason for the absence 
of expected inversions in their results 
might have been that the P_L values
were not low enough, but even Lloyd 
and Landis’ values of .005 and .01 
did not produce consistent inversions. 
The other explanation which the 
authors believe is "far more plausible"
(p. 44) appears to be that the
model as formerly stated is not ap-
plicable at low and medium lumi-
nances but "it is only at very high
intensities that short pulses produce
flicker" (p. 45). This change of hypothesis
was based on the occurrence
of small inversions in the curves of
CFF against P_L for four out of five
subjects at the highest luminance level
but only with the inferential method. 
The authors do not point out that 
there is no evidence of any inversions 
using the so-called direct test of the 
model, Method 1. 

Bartley and Nelson make the un- 
founded assumption that CFF curves 
must be irregular and state that no 
one has previously incorporated these
"irregularities" or inversions into a
model (1960c, p. 6). They errone- 
ously claim that there were inversions 
in both their own data and those 
of Ross (Bartley & Nelson, 1960b, p.
244; 1960c, pp. 6-7). However, the
above review of experiments shows 
that the inversions or irregularities 
have not been consistently demon- 
strated under the required conditions 
in any study. 


A SECOND APPROACH 


The relationship between CFF and
P_L can be examined by a different
method not previously investigated,
using not critical frequency in cycles
per second but the constituent values
of critical cycle time, viz., light time
(t_L) and dark time (t_D) at fusion, as
defined in the section on terminology.
Just as CFF is the critical number of
cycles per second at fusion, so t_D is the
critical duration of the dark time at
fusion. An alternative definition of
CFF to the one given earlier is the
frequency at which the dark time between
light pulses can be discriminated
(i.e., flicker reported) on half
the trials. Since the cycle time (reciprocal
of CFF) is the sum of two variables,
t_L and t_D, and on the assumption
that the discrimination of t_D is
important, these time values can be
used in the graphical representation
of results instead of the more usual
CFF scores. The significance of t_D as
a variable is that replotting CFF data
in terms of t_D gives graphs which not
only appear simpler than those plotted
with CFF in cycles per second but
are also related to dark threshold
curves discussed below.
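
The replotting described above can be illustrated by a minimal sketch. It assumes that the light-dark ratio variable is expressed as the fraction of each cycle occupied by light; the function name and the sample reading are hypothetical and are not taken from any of the studies reviewed.

```python
import math


def fusion_times(cff_hz, light_fraction):
    """Split the critical cycle time (1 / CFF) into light time and dark time.

    cff_hz         -- critical flicker frequency in cycles per second
    light_fraction -- assumed: fraction of each cycle occupied by light (0 to 1)
    Returns (light_time, dark_time, log10 of dark_time), times in seconds.
    """
    if cff_hz <= 0:
        raise ValueError("CFF must be positive")
    if not 0 < light_fraction < 1:
        raise ValueError("light fraction must lie strictly between 0 and 1")
    cycle_time = 1.0 / cff_hz                # reciprocal of CFF
    light_time = light_fraction * cycle_time
    dark_time = cycle_time - light_time      # cycle time is the sum of the two
    return light_time, dark_time, math.log10(dark_time)


# Hypothetical reading for illustration only: fusion at 40 cycles per second
# with light occupying one quarter of each cycle.
light, dark, log_dark = fusion_times(cff_hz=40.0, light_fraction=0.25)
print(f"light time = {light * 1000:.2f} msec, dark time = {dark * 1000:.2f} msec")
```

Each CFF score thus yields one point on a graph of log dark time against light time, which is the form of plot used in the figures discussed below.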

This approach can be shown in the
results of any studies of the CFF-P_L
relationship, but those of Lloyd and
Landis (1960) will be used again.
Figure 1 shows results of this study
plotted for CFF against P_L (uncompensated)
and Figure 4 the same results
with Log t_D plotted against t_L.
It can be seen that this second
method gives almost linear relationships.
Further graphs can be plotted
in this manner, for example, Log t_D
at fusion against area, luminance, or
P_L, and in each case a regular and
simple relationship is found. (The
logarithms of t_D values give the best
approximation to linearity, as may be
expected by comparison with other
visual functions.) It should be noted
that with this method of plotting results,
all values with the same P_L will
fall on a smooth curve. The curve for
P_L = .75 has been shown in Figure 4.

Fig. 4. Log t_D versus t_L for selected
uncompensated luminances, in trolands.
(Data as for Figure 1; dotted curve gives
values for P_L = 0.75.)

It is clear from Figure 4 that increasing
time of light and/or luminance
decreases t_D. The same can be
shown for increasing area of the test
patch (unpublished results of the
author). It seems that an increase in
the total excitation level, whether
through time, luminance, or area,
increases the processes, probably
inhibitory, responsible for perception of
flicker.

This observation is supported by
studies on the "dark threshold," defined
here as the dark interval necessary
for the fusion of two or more
successive flashes, as distinct from
"dark time," which refers to the dark
interval at CFF. A summary of dark
threshold studies (Bulanova & Luizov,
1954; Dunlap, 1915; Granit &
Hammond, 1931; Lichtenstein &
Boucher, 1960; Mahneke, 1958) is
included here since they illustrate a
simpler fusion situation which gives
similar results to those from CFF
studies and emphasizes the importance
of dark time to both situations.
As early as 1915 Dunlap attempted 
to estimate the shortest perceptible 
time interval between two flashes of 
light, and although a second aim was 
to ascertain the relationship between 
dark interval and CFF this was not 
carried out. Dunlap used an epis- 
cotister for the tests and was both 
subject and experimenter. The 
brightness of the light source was not 
given but must have been fairly low 
since the highest CFF was for P_L = .5.
Dunlap's main result was the decrease
of the dark threshold with increase
in the length of the first flash.
A comparable result was obtained by 
Bulanova and Luizov (1954) who de- 
termined dark thresholds for six ob- 
servers. They varied exposure time 
and luminances of the light source and 
surround and one of their main find- 
ings was a consistent decrease in 
dark threshold with increase in the 
luminance of the light source. 
Mahneke (1958) has improved and 
enlarged both these studies. He 
estimated the dark thresholds for 17 
different numbers of successive light 
flashes (2-100) and 6 different flash 
durations (1, 2, 5, 10, 20, 50 msec.).
He again found that dark threshold 
decreases as light time increases. 
Mahneke pointed out that although 
dark time must be an important de- 
terminant of CFF, most studies vary 
times of light and dark together, 
whereas only Dunlap and he up to 
that time had varied these inde- 
pendently of each other. He showed 
that an increase in the number of
flashes decreases the dark threshold
and that an increase in the duration 
of each flash in a series causes a de- 
crease in the dark interval. 

Figure 5 shows Mahneke's results 
replotted as Log t_D against t_L for
varying times of exposure. He re- 
ported that "the chief reduction in 
the dark interval takes place dur- 
ing the first approx. 500 msec." (p. 
17) and quotes Granit and Ham- 
mond's study (1931) in support of 
this. However, Figure 5 shows more 
clearly than Mahneke's graph that
t_D is still decreasing with exposure
times up to at least 2 seconds under 
the conditions of his experiment, but 
any longer exposure times would 
probably not improve the ability to 
discriminate much further. That the 
curves for 1- and 2-second exposures 
are almost identical and approaching 
linearity is important confirmation of 
the relationship between CFF and 
dark interval, since exposure times 
from 1 to 2 seconds are most common 
in CFF experiments which use the 
discontinuous method of limits (see 
Simonson & Brozek, 1952) or the 
method of constant stimuli (author's 
unpublished results). 

Lichtenstein and Boucher (1960) 
pointed out the importance of
"critical duration" in various visual
phenomena, such as occurs in the 
summation of subthreshold stimuli, 
the Bunsen-Roscoe Law, etc. In 
each of these cases a limiting (critical) 
value exists for the temporal effects 
of light energy. The authors studied 
the critical duration, not of “on” 
stimuli but of the dark interval be- 
tween stimuli or the "minimum de- 
tectable dark interval." The differ- 
ence between this and previous 
studies such as Mahneke's was that 
the light flashes instead of being 
physically continuous were them- 
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duration was found. 
Discussion 
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of curves at different luminance 
levels is obtained. This regularity 
provides a better understanding of 
the effect of the changes in P_L on
visual fusion than do the more usual
CFF-P_L graphs. Bartley's model for
the first approach has been discussed 
in detail as this is the only recent sys- 
tematic attempt to account for re- 
sults. However, it is not supported 
by empirical evidence nor does it 
seem likely to be in its subsequent 
altered form. 

The graphs of Log t_D against t_L
obtained from studies of CFF and
P_L clearly show that increase in time
of light, luminance, and/or area of 
the test patch, results in a decrease in 
the time of dark necessary at fusion. 
Similar results have been found in 
dark threshold studies and the con- 
clusion to be drawn is that any in- 
crease in the total level of stimulation 
reduces the dark time. This seems to 
involve an inhibitory process and this 
conclusion is supported by the work 
of Granit (1947, 1955). His findings 
on the electroretinogram of man show 
that increasing the level of energy in 
each flash, whether by area, luminance,
or time, "uncovers" the off-effect,
which Granit associates with
inhibition. The off-effect has been 
shown to be related to better dis- 
crimination of flashes, so that shorter 
dark times or high CFF are the re- 
sult. 

The possibilities of neurophysi- 
ological mechanisms are only briefly 
stated here to indicate a basis of the 
t_D-t_L relationship; the main concern
of the paper is with perceptual
data and their presentation. How- 
ever, there is observational evidence 
to support the brief summary given 
above. Lichtenstein and Boucher 
(1960) used the fusion to flicker 
method of limits, and as the episcotister
disc speed slowed down to
the first report of flicker, the dark interval
appeared as a "dark pulse"
which was most striking when the 
"on" stimuli were long. Mahneke 
(1958) mentioned a similar observa- 
tion: with a series of flashes each of 
50-millisecond duration, he found 
that a slight alteration in dark inter- 
val was enough to change the appear- 
ance from clearly fused to obviously 
flickering, while the same effect was 
not obtained with the same number 
of flashes but each of only one- 
millisecond duration. As Mahneke 
(1958) commented, "this finding also
indicates that the capacity of the 
human eye to discriminate successive 
light flashes increases with increasing 
quantity of light" (p. 16). 

The second approach to the representation
of CFF-P_L results leads to
alternative conclusions in some studies.
For example, McFarland et al.
(1958), in a study of age, CFF, and
P_L, found that there is a larger difference
between age groups' CFF scores
at low than at high P_L's and suggested
that sensitivity of CFF to the
aging process is enhanced at low
P_L's. Graphing their results using t_L
and t_D values, however, shows that
increase in age produces a constant
increase in time of dark necessary at
fusion for any given length of light.

Further experimental work should 
produce reliable CFF scores at extreme
P_L values, which are needed,
for example, to give conclusive evi- 
dence on Bartley's model. However, 
if an episcotister rather than elec- 
tronic apparatus is used there are 
difficulties in obtaining square light 
pulses at these extreme values (Lloyd 
& Landis, 1960, p. 332), and it has
been shown that if square pulses are 
not produced the CFF scores are 
affected (Bartley & Nelson, 1960a).

The present paper aims to show the
usefulness of presenting CFF data in
terms of time relationships at fusion.
Investigations of the effect of changes
in P_L on CFF are important because
these time functions are varied by
altering P_L, and the graphs of light
versus dark times lead to a better
understanding of the phenomenon of
CFF.

I will first state my bias. Going up
in the elevator at the Men's Faculty
Club at Columbia University years 
ago, I listened to Harold Urey, Nobel 
Prize winner, describing to his col- 
leagues how he viewed the problem of 
replication. 
You get an interesting result that you can't 
explain. You improve the conditions, put in 
little additional steps, so that your result will 
be clearer. You can't get the result that you 
got before. Then you say, “I'll go back and do 
it exactly the way I did it the first time." So 
you do that, and you don't get anything like 
what you got before. 


Everyone in science—even a hum- 
ble science like psychology and a still 
humbler one, parapsychology—has 
encountered what Urey was de- 
scribing. What is the course for the 
scientist under such conditions? Is it 
more constructive to say that, be- 
cause it could not be repeated, the ef- 
fect was never there? Or is it to say 
that it was an artifact? Or to say that 
it was just a challenge, something out 
on the edge of the known, waiting to 
be replicated? It seems to me that 
Girden feels that if a result cannot be 
replicated, it has to be written down 
as a fluke. My bias is to question this
as the scientist's approach to PK or 
to anything else. 

Psychokinesis is "the direct influence
exerted on a physical system
by a subject without any known intermediate
physical energy or instrumentation."¹
The commonest way to
test for its presence is to ascertain
whether dice, coins, etc., tumbled
against barriers, come to rest with
those faces uppermost which the subject
wishes to come uppermost; or,
in the case of "placement PK,"
whether the objects (dice, etc.)
come to rest, as the subject wishes,
on one designated area
and not on another. These are two of
the three main PK effects reported in
the literature. A third effect frequently
reported is that successes
("hits") are significantly more frequent
in first runs, or in the first
parts of a task, than later (the "decline
effect"), where mean chance expectation
would yield a horizontal
straight line with no significant slope.
It is Girden's task to review the evidence
for these three effects.

¹ Definition in glossary of each issue of the
Journal of Parapsychology.

To begin with something concrete:
McConnell, Snowden, and Powell 
(1955) (Department of Biophysics, 
University of Pittsburgh) undertook 
to replicate a series of Duke Uni- 
versity studies showing (a) excess of 
hits when wishing for the various
die-faces, (b) decline effects. They 
used a motor driven cage which 
whirled dice against a system of in- 
ternal baffles, and as the cage came to 
rest at the end of each 180° turn,
automatically photographed them. 
At various times subjects determined 
in advance for which die-face to 
"wish." The hits were not signifi- 
cantly high, but the decline effects 
were significant (p = .002). Granting
that much of the early Duke Univer- 
sity work of J. B. Rhine (1943, 1944, 
1945) and others used loose methods, 
with hand thrown and cup thrown 
dice, warranting only a preliminary 
hypothesis, the first serious problem 
would seem to be this replication by 
McConnell et al. The answer, so far,
is partially positive. A second question
relates to further replication, using
this same equipment. Dale and
Woodruff (1947), using the same 
equipment and also other equipment, 
did not get significant results either in
total hits or in decline curves. Gir- 
den's feeling seems to be that since, 
when using the same equipment, two 
investigators did not get a successful 
replication, the McConnell data leave 
no positive residuum of evidence for 
the effect. The value of .002 for the 
decline effect reported by McConnell 
and collaborators to confirm Rhine's 
various studies, would ordinarily be 
regarded as safely significant, and 
even if a correction is introduced 
owing to the fact that this is a "selected"
result, not replicated by Dale
and Woodruff, some would regard the
hypothesis as tentatively "confirmed."
But let us begin by admitting
different approaches to procedures
of this sort.

The same logic would appear to 
apply to an earlier replication study 
by Dale (1946). This study, with 
equipment described below, entailed 
the assignment of subjects to groups 
who tried, respectively, for sixes, fives, 
fours, threes, twos, ones in a prede- 
termined order so that dice biases 
which would favor higher scores dur- 
ing part of the work would be bal- 
anced by scores disfavored by such 
bias, and cutting across both indi- 
vidual subjects (all subjects tried for 
all faces in turn) and permitting the 
analysis of decline curves for each 
subject. The excess of hits was significant
at the .005 level. The type of decline
curves previously noted (consistent
decline from beginning to end
of a task) did not appear, although a
similar phenomenon appeared: most
of the excess of hits was in the first
portion of the subject's work ("beginner's
luck"). The dice were cast down
a chute with many baffles which
tossed them about, so that on the first
run, as on the last, their position after
coming to rest was (normally) wholly
unpredictable. Dale added one other
check for internal consistency. She 
did an odd-even correlation by sub- 
jects, showing that the same subjects 
who scored high on odd numbered 
units of the work tended to score high 
on the even numbered units of the 
work, and so on, just as if we were 
dealing with tests of perceptual or 
motor skill. It is difficult to see how 
this measure of internal consistency 
could be due to biased dice, if one re- 
calls that the irregularities or wear 
and tear in the dice are a function not 
of individuality in the psychological 
sense, but of physical attributes 
which cut across such individuality. 
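
The odd-even check can be sketched as follows. This is only an illustration of the general split-half procedure, not a reconstruction of Dale's computation; the function names are hypothetical and the hit counts are invented for the example.

```python
import math


def odd_even_totals(unit_scores):
    """Sum hits over odd-numbered and even-numbered units of work.

    unit_scores lists hits per unit in the order worked (unit 1 first).
    """
    odd = sum(unit_scores[0::2])    # units 1, 3, 5, ...
    even = sum(unit_scores[1::2])   # units 2, 4, 6, ...
    return odd, even


def pearson_r(xs, ys):
    """Product-moment correlation between two equally long lists of scores."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list has no variance")
    return sxy / math.sqrt(sxx * syy)


# Invented hit counts per unit of work for four hypothetical subjects.
subjects = {
    "S1": [5, 6, 5, 7, 6, 6],
    "S2": [3, 4, 4, 3, 4, 3],
    "S3": [4, 4, 5, 4, 4, 5],
    "S4": [6, 7, 7, 6, 8, 7],
}
pairs = [odd_even_totals(scores) for scores in subjects.values()]
r = pearson_r([odd for odd, _ in pairs], [even for _, even in pairs])
print(f"odd-even correlation across subjects: r = {r:.2f}")
```

A high positive r means that subjects who score well on one half of the work also score well on the other half. Since all subjects throw the same dice, a bias in the dice alone would not be expected to order the subjects in this way.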

I have said more than Girden says 
about the McConnell et al. studies 
and the Dale studies. Indeed, in 
many cases his accounts of experi- 
mental methods are so brief that it is 
hard to comprehend what the experi- 
menter was trying to do, what his 
methods were, and what results he 
obtained. In the study by Binski 
(1957), dealing not with dice, but 
with coins and a roulette wheel, one 
gets from Girden the impression that 
the subject's 10 throws of 100 coins 
each, sometimes trying for one face of 
the coin and sometimes for the other, 
gave a "run of luck," that is, within 
chance expectation. Thus if heads 
were predicted, there might well be 
several consecutive throws which 
gave more heads than tails. Actually, 
100 coins were thrown each time and 
there was one individual subject 
whose trials yielded the desired face 
in 548 out of 1,000 observations, 
which is a very significant number of 
hits; and he did as well or better in 
each of the next four throws. Statistical
treatment must deal not, as
Girden does, with 10 trials, but with a
trend involving 10 × 100 trials and a
large deviation or "antichance"
value. Girden's report also fails to
note that the same subject who is 
unique in these coin results was, by a 
wide margin, also unique in an experi- 
ment involving a standard roulette 
wheel, so that no gross averaging of 
either the roulette results or the coin 
results pooling all subjects would be 
permissible. In other words, there 
are in Binski's work two very signifi- 
cant studies of a certain individual, 
and this has somehow not appeared 
at all in the condensed statement 
which Girden offers. 
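
The weight of 548 hits in 1,000 observations can be checked with a short calculation. The sketch below treats the 1,000 coin faces as independent and takes the chance probability of the desired face as one half; both are assumptions made here for illustration, and the function names are hypothetical. It does not address the separate question of selecting one subject from several.

```python
import math


def binomial_upper_tail(hits, trials, p=0.5):
    """Exact chance probability of `hits` or more successes in `trials` tosses."""
    return sum(
        math.comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(hits, trials + 1)
    )


def normal_deviate(hits, trials, p=0.5):
    """Deviation from chance expectation in standard-deviation units."""
    expected = trials * p
    sd = math.sqrt(trials * p * (1 - p))
    return (hits - expected) / sd


hits, trials = 548, 1000
z = normal_deviate(hits, trials)
p_exact = binomial_upper_tail(hits, trials)
print(f"deviation = {hits - trials / 2:+.0f} hits, z = {z:.2f}, "
      f"one-tailed p = {p_exact:.4f}")
```

Treated instead as 10 throws, each scored only as above or below chance, the same data would carry far less weight, which is the difference between the two treatments described above.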
A comparable case is an experiment 
by Mangan (1954), which is put to 
one side by noting that it began with 
a "solo" series, in which the subject is
his own experimenter. The point of
such a solo series, however, is to get
preliminary data, which are then to
be extensively checked, in due time,
by having an independent observer and
recorder properly record them. Mangan
did this, and the high results were 
maintained. It is to be doubted 
whether Girden grasped the purpose 
of the experiment. Another interest- 
ing point that bears on the question 
of the way in which Girden worked: 
Mangan, noting the bias of ordinary 
dice, prepared to do two things; first, 
to make alternate throws for high and 
low dice (e.g., 6's versus 1's), taking 
an everyday working attitude, and 
dealing with totals in which the 
tendency of the low dice to go below 
chance could be balanced by the tend- 
ency of the high dice to go above 
chance, and obtaining a safe margin 
of significance for totals by playing 
one die bias against the other. Man- 
gan then noted the possibility that 
the human observer, as well as the dice, 
has biases, and offered some tentative 
data on the summation of such effects.
Repeatedly in this report, indeed,
the experimenter's willingness
to work with the normal situation of
dice bias, to stress it, and to make use
of it for experimental purposes, leads
only to Girden's comment that the
dice were biased.

Now as Girden indicates, there has 
been controversy about much of the 
PK research since it was inaugurated 
in the '30s and '40s, and first saw 
publication in 1943. Vigorous attacks 
on “optional stopping," inadequate 
rotation of die faces, permission to 
subjects to call any die face they 
liked, and several other loose and 
casual procedures, have been rather 
prominent in the parapsychology lit- 
erature, and many of the defects have 
not been vigorously or consistently 
removed. Girden has done us a 
genuine service in calling attention to 
these unrectified errors. 

At the same time I would gladly 
agree that there is no published ex- 
periment on PK which involves all 
the features of an "ideal" experiment
as follows: (a) large enough mass of 
material to be treated by conven- 
tional statistical methods; (b) com- 
pletely impersonal, preferably me- 
chanical or electronic, devices for 
tumbling the dice, coins, etc., within 
a compartment from which they can- 
not escape but in which they can be 
photographed; (c) instructions to the 
subject to try now for a particular die 
face, now for another die face, the instructions
being mandatory and the
order of targets being determined
independently of the subject's choices
or habits; (d) offering photographs of
the equipment with the subject in
place and photographs of the dice
when at rest, to be made available to
all serious investigators who are interested;
(e) the two major hypotheses,
first anticipating an overall excess
of hits over mean chance expectation,
and second, decline curves in
terms of functional units of work 
done by the subject, should be ana- 
lyzed in the customary manner; (f) 
the matter of optional stopping ap- 
plied both to a terminal point for 
each individual subject and a ter- 
minal point for the selection of the 
last subject who is to be run. It 
would be tedious to set up a box 
score to indicate the degree of compli- 
ance of each experiment with these 
specifications. It may be noted that 
very few psychological experiments, 
if any, are set up in this way. The 
more customary procedure is for the 
experimenter, after reading criticisms
of his work, to accede as far as
possible to such requests, and this I 
believe the reader will find has been 
done in the case of the Dale and Mc- 
Connell experiments and the For- 
wald-Pratt experiments noted below. 
When we are dealing with "place- 
ment” effects—that is, an attempt to 
move the dice into a particular posi- 
tion rather than to make a certain 
face come up the most—the precau- 
tions will involve variations, but in 
general the principles just stated 
would appear to hold, and I am glad 
that Girden has stressed them. 

It is puzzling to find Girden taking 
the position with increasing stress
toward the end of his paper that no 
psychological hypothesis is being 
tested and no psychological condi- 
tions investigated for their effects. 
Hypotheses regarding motivation, 
satiation for the task, anxiety and 
self-consciousness are invoked in 
many of these studies, sometimes 
stated before the event and then 
tested, and sometimes offered as 
afterthoughts, tentatively suggested 
as material for further work. In fact, 
in view of the frequency with which 
one hears the statement that there are 
no orderly systematic hypotheses in
parapsychology, one might be bewildered
to encounter the large number
of tested and testable hypotheses
offered. 

After experimental data have been 
analyzed, the attempt to cast some 
light upon what may have been hap- 
pening is ordinarily considered rea- 
sonable, although of course a fresh 
test of any emerging hypothesis is 
needed. It seems a little unfair to 
refer to this standard psychological 
procedure as "post hoc rationaliza- 
tion" (Girden, 1962). Thus, in view 
of the large amount of interest in 
attitude toward a task as influenc- 
ing success in that task, the sugges- 
tion offered by an experimenter that 
intellectual conflict might (sic) unconsciously
act as a score depressing factor
seems a modest one.

In the matter of replication one 
notices, as one does in work that is 
half-experimental, half-clinical, the 
effort to get a psychologically mean- 
ingful and highly motivated task, and 
the willingness to abandon a pro- 
cedure which seems to be leading into 
a cul de sac. The question then arises 
whether, along with the shift to a new 
method, there is a consistent attempt 
to create psychological conditions 
which will favor a phenomenon which 
is so little understood. It seems to 
me, without this freedom to alter at- 
titudes experimentally, one would 
not get far in any frontier area. One 
question, of course, is whether the 
new hypotheses are pursued long 
enough. This does not result solely 
from an oversight. Rather, it results 
from the intrinsic difficulty of de- 
veloping workable hypotheses which 
arise from our present limited knowledge,
and keeping experimenters
with their noses to the grindstones to 
pursue these relentlessly, even at the 
cost of abandoning all the interesting 
possibilities which occur to experimenters
in a new field. It is a very
human situation. I think one pays a 
considerable price for the lack of in- 
terest in replication; in fact, I have 
been consistently a pleader for more 
and more narrowness, and the denial 
to oneself of challenging new oppor- 
tunities, because that is the only way 
in which I could maintain my demand 
for replication. But he is a pretty wise 
man who can decide what, in a new 
science, is the soundest procedure. To 
accuse investigators, however, of 
"free wheeling," or lack of hypotheses 
in papers which abound in sugges- 
tions for new work, as are those by 
Forwald (1952), seems unnecessarily 
arbitrary. 

It is also difficult to understand 
why there is repeated emphasis by 
Girden on the point that the cardinal 
control method must be the compari- 
son of scores made during wishing 
with scores made under nonwishing 
conditions. There are many other 
types of controls. In fact, the com- 
parison of individual die faces for 
which one wishes with other die faces 
for which one does not wish seems to be 
accepted and stressed by him else- 
where in his paper as a salient and 
necessary control. Moreover, in the 
placement tests the area into which 
one tries to push the dice is compared 
with the area into which one does not 
wish them to move, and this is properly
accepted as a control. Moreover,
the questions about decline effects are 
regarded by him as legitimate ques- 
tions to raise, although in these cases 
there is no explicit wish for a decline 
and no wish during the course of the 
experiment to progress from hits in 
one position to hits in another posi- 
tion. 

The matter of bias in dice is a very 
old question. It was faced by Rhine 
and the Duke group at the very beginning
of their work. With most
dice the amount of material excavated
for placement of the dots results
in a bias in favor of sixes, with the
frequency falling off, for the most part,
as the face value declines, so
that the Number 1 is the least frequently
thrown. Knowledge of dice
bias is not, however, always com- 
bined with awareness of the fact that 
differential wear and tear, chipping, 
smoothing, and other frictional and 
gravitational factors are fantastically 
complicated and that such “perfect 
dice" as are advocated by Scarne 
(1956) produce their own problems 
quite quickly. At the American So- 
ciety for Psychical Research (ASPR), 
for example, we carefully considered 
the question, tried out the perfect 
dice and decided on grounds similar 
to those of Rhine, that it is very much
better to work with an internal con- 
trol in the experimental design, now 
trying for one face and now for 
another, targets being randomized or 
systematically varied in a manner un- 
correlated with wear and tear. 
Though there were in the early days 
of the Duke laboratories some un- 
fortunate lapses, the fact remains 
that unawareness of this phenomenon 
was not something one could throw 
at the experimenters, suggesting essentially
that "any intelligent schoolboy"
would know better than to proceed
with such dice as are commercially
available. Dice bias has been a
primary problem all along. There are
dozens of situations in psychology in
which one uses imperfectly balanced
material and "rotates out" the de- 
fects by one sort of control method or 
another. This is what has been done 
in PK research. Girden has made a
valuable point in showing that some 
of the decline effects may have been 
due to using too many sixes in early 
series and too many ones in later 
series, etc., but this is a specific
critique of a few specific experiments
and does not cover all decline effects, 
several of which have appeared very 
clearly even under the conditions of 
rotated targets just described. 
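
The arithmetic behind "rotating out" dice bias can be shown in a few lines. The sketch assumes a die whose face probabilities are fixed but unequal, and a design in which each face serves as target on an equal share of the throws; the particular probabilities are invented for the example and the function name is hypothetical.

```python
from fractions import Fraction


def expected_hit_rate(face_probs, target_shares):
    """Chance hit rate for a die and a schedule of targets.

    face_probs[i]    -- probability that face i+1 comes up (bias allowed)
    target_shares[i] -- share of all throws on which face i+1 is the target
    """
    if sum(face_probs) != 1 or sum(target_shares) != 1:
        raise ValueError("probabilities and shares must each sum to 1")
    return sum(p * w for p, w in zip(face_probs, target_shares))


# Invented bias: sixes come up most often, ones least often.
biased_die = [Fraction(n, 100) for n in (14, 15, 16, 17, 18, 20)]

rotated_targets = [Fraction(1, 6)] * 6            # every face is target equally often
sixes_only = [Fraction(0)] * 5 + [Fraction(1)]    # subject always calls for sixes

print("rotated targets:", expected_hit_rate(biased_die, rotated_targets))
print("sixes only:     ", expected_hit_rate(biased_die, sixes_only))
```

With equal rotation the chance hit rate is one in six whatever the bias, because each face's excess or deficit enters the total with the same weight. With sixes alone as target, the bias passes straight into the score. The sketch says nothing about wear that changes during a series, which is why the order of targets must also be uncorrelated with it.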

In discussion of the "probability 
model" Girden feels that experimen- 
ters make a "basic assumption" that 
known factors are randomly dis- 
tributed and hence, when obtaining 
deviations from the theoretical pre- 
dictions, must be dealing with the un- 
knowns. Rather, it would seem to me 
that investigators recurrently en- 
countering unexplained deviations 
from a probability model have a clear 
scientific obligation to pursue such 
matters and try out hypotheses by 
experimental intervention, that is, 
intervention through the use of inde- 
pendent variables. “Wishing” is pre- 
cisely such an independent variable, 
introduced to see whether the effect is 
in the expected direction. The same 
is true of a few cases in which the ex- 
pectation is that the objects will move 
in a manner diametrically opposed to 
that of the wish, and here likewise 
the hypothesis has been tested. 

Now a few more words about repli- 
cation. In view of the large-scale ef- 
fort made by Dale (1946) in her “First 
ASPR Experiment,” and the positive 
results in total hits it is puzzling to 
learn that the failure of similar re- 
sults to appear in later tests in which 


— and other conditions 
hat altered, is taken as evi- 
dence against the reality of the first 
results. Compare Urey above. Per- 
haps Dale and other experimenters 
should go on to the bitter end, repeat- 
ing and repeating. From the account 
given it seems likely that this se- 
quence of events characterized her 
effort: initial enthusiasm, satiation, 
anxiety, and inability to get back the 
psychological conditions of the initial 
success. Is it appropriate psycho- 


among psychologists as to how to
proceed in such an instance. She re 
once with 


working condition 
and partly statistical, since one 
set of results is significant at the .005
level, and the other not significant. 
A more dramatic example of Gir- 
den's position is his reporting of 
extensive experimental work of 
the Swedish engineer, Forwald (1952). 
The basic belief that no solo work can 
ever be sound reappears. The later 
successful repetition of the subject's 
work in the presence of witnesses is 
not enough to get rid of the 
contamination. 

Forwald's work, apparently be- 
cause some of it was solo, is [...] 
the A region. Throughout the reports 
making use of this table there is 
systematic alternation between use of 
[...] point, however, is that with proper 
alternation of A and B, the A is penal- 
ized when nonpreferred just as it is 
favored when preferred. 
Both series plainly show the favoring of the 
“A” side of the target already mentioned, an 
effect which could, of course, have been re- 
duced by altering the apparatus, but H. F. 
did not consider it advisable to make the 
changes during the course of the experiment. 
The fact that each side was used as 
target an equal number of times com- 
pensated for the bias in the appara- 
tus. 
Table I also shows the results of the two con- 
trol series. It may be seen that although they 
gave a total deviation that was negative in- 
stead of positive (that is, the total number of 
dice falling into the target area was below the 
number expected on the chance theory), 
nevertheless the favoring of the “A” side over 
the “B” side continued about as markedly as 
before. This is what one would expect if the 
favoring of the “A” side in the experimental 
series was due chiefly to bias in the machine 
rather than to PK. 


Yet it is in reference to this kind of 
reasonable comment by an experi- 
menter that Girden finds it appropri- 
ate to berate the experimenter for in- 
attention to equipment bias and for 
lack of control series. Girden's com- 
ment on this was that here was a 
patent flaw which should at once have 
been corrected. It seems to me that 
this is a question for an experimen- 
ter's judgment, regarding which out- 
side observers or readers are hardly 
likely to have any great wisdom to 
offer. When Forwald, in a later ex- 
periment, continues to use the same 
table, he is reprimanded by Girden 
for this and reprimanded for not re- 
stating the fact that the table had a 
bias. Since, however, the same built- 
in controls are involved as in the 
earlier experiment, it would appear to 
me that this is a stylistic matter; the 
reader might well want to compare 
the two studies, noting that the same 
table was used. Later, when Forwald 
came to the United States, he built a 
new table which was free of bias, and 
then both alone and with observers 
present he proceeded to get the same 
sort of placement PK effect that he 
had gotten in Sweden. 
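
The compensating effect of alternating the target side can be checked with a little arithmetic. The sketch below is only an illustration under stated assumptions: the bias value of .56 toward the A side is hypothetical and is not taken from Forwald's tables, and the function name `expected_hit_rate` is introduced here for the example.

```python
from fractions import Fraction


def expected_hit_rate(p_a, share_a_target=Fraction(1, 2)):
    """Chance-level hit rate on a table that sends a die to side A
    with probability p_a, when A is the target on a fraction
    share_a_target of the trials and B is the target on the rest."""
    p_a = Fraction(p_a)
    share = Fraction(share_a_target)
    # A-target trials score when the die lands on A;
    # B-target trials score when it lands on B.
    return share * p_a + (1 - share) * (1 - p_a)


# Hypothetical apparatus bias toward side A (assumed value).
bias_toward_a = Fraction(56, 100)

# Equal use of both sides as target: the bias cancels.
balanced = expected_hit_rate(bias_toward_a)

# Unequal use (A as target on 3 trials in 4): the bias leaks into the score.
unbalanced = expected_hit_rate(bias_toward_a, Fraction(3, 4))

print(balanced, unbalanced)
```

With equal alternation the expected rate is one half whatever the size of the bias, which is the sense in which the A side is penalized when nonpreferred just as it is favored when preferred. The cancellation holds only if the bias itself is the same on A-target and B-target trials.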

The Forwald data also yielded in- 
teresting material, replicated at least 
twice, indicating a U curve in the 
course of the hits in five consecutive 
tasks; that is, initial and terminal suc- 
cesses are commoner than those in the 
middle. Rereading these Forwald 
studies and rereading the Girden 
analysis, I cannot find what is seri- 
ously wrong with the experiments. 
What is featured is the fact that ini- 
tially Forwald worked alone. What 
bearing this has on the situation in 
PK research today is not clear to me. 

Incidentally, data from a control 
series following immediately upon ex- 
perimental series are at times some- 
what like the experimental series 
rather than like what is expected 
from a probability model. Is there 
not a possibility that there is a psy- 
chological perseveration from the 
task into the control, particularly if 
the longer one continues to gather the 
data, the less the control resembles 
experimental series? Of course, this 
has to be independently tested but it 
is a reasonable hypothesis as offered 
by Forwald. 

Regarding this Forwald series as a 
whole, the use of hostile phraseology 
seems to call for some attention. Re- 
garding the Pratt-Forwald (1958) 
experiment in particular, to which 
special importance has been attached 
(in which Pratt and Forwald worked 
through a long series of preliminary 
studies to see what the best psycho- 
logical atmosphere was, then made 
predictions of a highly specific nature 
as to what the subject would do and 
then got the expected results), Girden 
comments: "In terms of planned ex- 
perimentation, this study was free- 
wheeling, with no predetermined de- 
sign and no recorders who were ig- 
norant of the wished for area." In 
view of the fact that the correlations 
of observers in the Forwald series ap- 
pear to have been of the order of .99, 
one cannot help wondering if the ar- 
tillery used is somewhat heavier than 
required in scientific writing. It is 
certainly true that it would have been 
fine to get photographic records. I 
often wish that we had them our- 
selves for experiments in which we 
study the role of autistic factors in 
perception, where we find that such 
autistic factors as an expression of the 
experimenter's personality are nearly 
as dramatic as when they are an ex- 
pression of the subject's personality. 
The absolute or ideal experiment has 
not, as already noted, been per- 
formed. 

Thinking over this question wheth- 
er you must always set up a fresh 
experiment if you are going to test a 
hypothesis that occurs to you after 
the event, I wonder how this would 
work in paleontology, or in the use of 
radioactive carbon to date historical 
events. It is not actually the predic- 
tion in terms of a fresh experiment 
that constitutes the ideal method, be- 
cause the situation is not recoverable 
to this degree. Very often you go 
back over periods of time in which 
you think physical, biological, and 
psychological conditions are probably 
similar, and see whether you continue 
to get verification of a hypothesis 
which has occurred to you. If you can 
keep yourself uncontaminated, you 
may be able to find out something. 
When PK "decline curves," for ex- 
ample, were discovered, the aim was 
to go back and see whether they 
would recur in other material where 
they had not been suspected. This is 
not an ideal method, but it is an im- 
portant step and hardly to be be- 
littled. It is certainly true that the 
common type of decline curve cannot 
be counted upon to reappear. There 
are, however, other types of decline 
curves, such as the rapid decline in 
effects which appear from properly 
randomized first calls or first blocks 
of calls as in Dale’s work; these are of 
real psychological interest and call for 
much further work. 

Again in the matter of decline 
curves, the fact that Forwald's out- 
standing result is with the first trial in 
fresh series, leads Girden to comment 
that "physical conditions were likely 
to be unsteadiest" during such first 
calls. Whether this is true, I am not 
enough of a physicist to say, but I am 
sure that there is a psychological dif- 
ference between the first and subse- 
quent calls, and I wouldn't want to 
pursue this as far as Forwald did 
(incidentally Forwald is an engineer). 
Girden also thinks that the dice 
throwing machine used by McConnell 
in the Biophysics Department at the 
University of Pittsburgh may, as the 
result of night usage, have been sub- 
ject to a warm-up effect the following 
day. What would the warm-up effect 
do as far as decline curves in re- 
sponses to randomized target orders 
are concerned? 

The Girden report may have been 
prepared for publication sometime 
ago, for it makes no comment on 
several interesting recent reports on 
PK, notably one by Fahler (1959). 
Here the equipment last used at Duke 
by Forwald was used, alternating 
with each of the nine subjects from an 
A side to a B side of a dividing line in 
the middle of the table, so that sub- 
jects could wish for movement of the 
dice toward the one side or the other. 
As in Forwald's work, imperfections 
in the table or the rest of the equip- 
ment favoring one or the other side 
appear to be controlled by the experi- 
mental alternation between the two. 
In the present experiment the same 
six cubes are used throughout the ex- 
periment, and whatever imperfec- 
tions they contain must affect scores 
both in the five A-side trials and the 
five B-side trials which alternate with 
them. In Fahler's tests, preliminary 
work had shown which of the nine 
subjects were in general known 
around the laboratory as likely to ob- 
tain high scores in both PK and 
earlier clairvoyance tests, and which 
subjects could be expected to perform 
below chance level. The predictions 
were correct in eight out of the nine 
cases, and when the scores achieved 
by all of the subjects are compared by 
t test, a significant difference between 
the positively predicted and the nega- 
tively predicted appears. Perhaps 
Girden would like to add a comment 
on this and several other PK papers 
which have been written within the 
year. It is true that no photographic 
records of positions of the dice are 
offered, but there is considerable in- 
terest, as there is in other recent 
papers, in the comparison of inde- 
pendent observers and the determina- 
tion of a high order of agreement. 
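
As a rough check on what "eight out of the nine" amounts to, the binomial tail can be computed directly. This is a minimal sketch under an assumption that is not in the Fahler report: that each prediction of above-chance or below-chance scoring would be right with probability one half if the preliminary work told us nothing. It is not the t test mentioned above, and the function name `binomial_tail` is introduced here for illustration.

```python
from math import comb


def binomial_tail(successes, trials, p=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Eight or more correct predictions out of nine, assuming each
# prediction is a fair coin flip under the null hypothesis.
tail = binomial_tail(8, 9)
print(round(tail, 4))
```

Under that assumption the one-tailed probability is 10/512, a little under .02. A single series of nine subjects is, of course, a small basis for any conclusion.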

A POSTSCRIPT TO "A REVIEW OF 
PSYCHOKINESIS (PK)" 
EDWARD GIRDEN 
Brooklyn College¹ 


The opportunity has been afforded 
the writer to consider a report pre- 
pared by Murphy (1962), on “A Re- 
view of Psychokinesis (PK)" (Gir- 
den, 1962). Since no complete review 
of PK had previously appeared—in 
American Psychological Association 
journals, or elsewhere for that mat- 
ter—the amount of detailed presenta- 
tion required for a professional audi- 
ence unfamiliar with the subject 
matter is not simply resolved. Space 
in scientific journals is always at a 
premium and its allocation is an edi- 
torial responsibility. In a specific edi- 
torial directive to condense the sub- 
mitted manuscript, the printed review 
is a shortened version of the original 
manuscript and must be considered 
in this light. 

How much detail is required to 
afford insight to the reader? Murphy 
(1962) comments that “I have said 
more than [Girden] says about the Mc- 
Connell et al. studies and the Dale 
studies. Indeed, in many cases his 
accounts of experimental methods are 
so brief that it is hard to comprehend 
what E was trying to do, what his 
methods were, and what results he 
obtained.” Yet with respect to the 
McConnell, Snowden, and Powell 
(1955) and the Dale (1946; Dale & 
Woodruff, 1947) studies, excluding 
the tabular description of experi- 
mental conditions and results, more 
than twice as many typed manuscript 
lines were used in the review by the 
present writer than in the manuscript 
for the report. Further reflection at 
this time suggests that the presenta- 
tion of McConnell et al. (1955), car- 
ried out in 1948-50, might have been 
further condensed since the authors 
consider the Journal of Experimental 
Psychology article a preliminary state- 
ment to be followed by a more ex- 
tensive report.² 

1 Grateful acknowledgement is made for the 
Thomas Welton Stanford Fellowship (Stan- 
ford University) and the John Simon Guggen- 
heim Fellowship which made possible this 
effort. 

Let us also consider the treatment 
of Binski's (1950) data, “on two ex- 
ploratory series," which Murphy ex- 
amines in detail. The text and Table 
3 of the review demonstrate that all 
essential conditions were reported, in- 
cluding the number of Ss on the rou- 
lette test (N = 123) and on the coin 
(100 per throw) test (N = 117). Mur- 
phy emphasizes that one S, in coin 
tossing, scored "a very significant 
number of hits [548 out of 1,000] and 
he did as well or better in each of the 
next four throws." The deletion 
from my final version of the manu- 
script of the statement that this S was 
also superior on roulette guessing does 
not constitute a misrepresentation, 
since the review noted that some new 
unpublished tests with this S “were 
‘encouraging but not statistically 
significant’” (Girden, 1962). As far 
as is known, no further reports about 
this S have appeared. But the data at 
Las Vegas are more convincing in this 
connection. One player held³ the 
dice for about 1 hour and 20 minutes. 
Consultation, as I have found, with 
gamblers where betting is legal or 
with any other devotees of this aspect 
of our culture will uncover even more 
startling data. Murphy apparently 
selects what appears to be a run of 
luck on two tests with one S out of 
117 Ss as significant for PK. 

2 R. A. McConnell, personal communica- 
tion, 1959. 

3 On a first throw, the total of the faces of a 
pair of dice, from 4 to 10 inclusive, con- 
stitutes a point to be made before 7 subse- 
quently appears. If a 7 appears before the 
point, the dice are lost to the next player. If 
the point is made, the next "first" throw is the 
new point to be made, again before a 7 sub- 
sequently appears. (On a first throw, 7 as well 
as 11 is a point. Two, 3, or 12 appearing as a 
first throw results in the loss of the bet, but 
the dice are retained. The next throw con- 
stitutes a "first" throw.) As a result of cau- 
tion, the player left after winning only a few 
thousand dollars. 

One occupational hazard in pre- 
paring a review article is the failure to 
obtain complete coverage. The 
present effort has been convincing 
that complete coverage cannot be ob- 
tained from the published record. 
Indeed, I am much indebted to many 
people here and abroad who in per- 
sonal conversations called attention 
to studies to which I had not seen 
reference as well as results not avail- 
able in the published reports. It is 
not clear, however, why Fahler’s 
(1959) paper was overlooked since 
the journal in which it appeared was 
one that had been thoroughly 
searched. I am therefore pleased that 
this unfortunate omission has been 
detected. But an examination of the 
results does not appear to change the 
basic issues. 

This is not the appropriate place to 
deal with the problem of the psy- 
chology of scientific controversy, an 
interesting topic on which little re- 
search has been done (Boring, 1929, 
1952, 1961). Psychical research, of 
which PK is one phase, is decidedly 
controversial. Here are expressions of 
opinion by Murphy (1948) regarding 
psychic research made in other con- 
texts: 


insofar as psychical research can ultimately 
qualify as an experimental science, it will be 
forced, as are all experimental sciences, to 
develop truly repeatable experiments; experi- 
ments which ... can be independently re- 
peated by any competent investigator. ... 
It is hard to see even the beginnings of a 
science of parapsychology until a repeatable 
experiment ... is obtained (pp. 18f.). 


And more recently he commented 
that: 


The fundamental rule in laboratory science is 
that you truly have captured a phenomenon 
and begin to understand it only when you can 
so fully specify the conditions which engender 
it, that you can yourself make it happen again 
and again, and other qualified workers can do 
the same. We do not, for the most part, even 
attempt this in parapsychology (Murphy, 
1958, p. 233). 


These statements by Murphy appear 
to agree fully with the tenor of my 
review. 

With respect to PK as well as ex- 
trasensory perception, the present 
writer's opinion coincides with that 
expressed by Murphy (1957), when 
he said: 
I note a very marked petering-out of successful 
experimentation generally . . . not only is 
there a loss of spontaneous cases and medium- 
istic phenomena. The quantitative data them- 
selves are sparse; slight effects have to be 
played up; large areas are barren; repetitions 
are rare... (p. 175). 


Concerning the existence of PK, this 
writer has no strong opinion pro or 
con but, on the basis of the available 
evidence, the soundest judgment is a 
Scottish verdict: not proven. 

ERRATA¹ 


In the article entitled “An Exact Multinomial One-Sample Test of Signif- 
icance” by A. Chapanis (Psychol. Bull., 1962, 59, 306-310) one of the terms 
in Equation 3, page 306, is incorrect. The equation should read as follows: 


[Equation 3, corrected form; not recoverable from this copy] 


On page 307 the first line of computations illustrating the application of 
Equation 3 to a particular set of data should read as follows: 


[First line of computations, corrected form; not recoverable from this copy] 


On page 308 delete the first two lines of computations illustrating the ap- 
plication of Equation 3 to another set of data. The computations on the 
third line of that illustration, and all other computations in the article, are 
correct. 


In Equation 4, page 310, the quantity un should be read as n_i. 


1 The author is indebted to Joseph L. Fleiss, Biometrics Research, New York State De- 
partment of Mental Hygiene, for calling these errors to his attention. 




